

Foreword... 


The purpose of this book is to complement the lectures and thereby decrease, 
but not eliminate, the necessity of taking lecture notes. Reading the appropriate
sections of the book before each lecture should enable you to understand
the lecture as it is being given, provided you concentrate! This is particularly 
important in this course because, as theoretical machinery is developed, the 
lectures depend more and more heavily upon previous lectures, and students 
who fail to thoroughly learn the new concepts as they are introduced soon 
become completely lost. 

*** Proofs of the theorems are an important part of this
course. You cannot expect to do third year Pure Mathematics
without coming to grips with proofs. Mathematics is about proving
theorems. You will be required to know proofs of theorems for the
exam. ***

It is the material dealt with in the lectures, not this book, which defines the
syllabus of the course. The book is only intended to assist, and how much 
overlap there is with the course depends on the whim of the lecturer. There 
will certainly be things which are in the lectures and not in the book, and 
vice versa. The lecturer will probably dwell upon topics which are giving 
students trouble, and omit other topics. However, the book will still provide 
a reasonable guide to the course. 


Prerequisites


Students will be assumed to be familiar with the material mentioned in this 
preliminary chapter. Anyone who is not should inform the lecturer forthwith. 


§0a Concerning notation


When reading or writing mathematics you should always remember that the 
mathematical symbols which are used are simply abbreviations for words. 
Mechanically replacing the symbols by the words they represent should result 
in grammatically correct and complete sentences. The meanings of a few 
commonly used symbols are given in the following table. 


Symbols                       To be read as

$\{\,\dots \mid \dots\,\}$    the set of all ... such that ...
$=$                           is
$\in$                         in or is in
$>$                           greater than or is greater than

Thus for example the following sequence of symbols

    $\{\, x \in X \mid x > a \,\} \ne \emptyset$

is an abbreviated way of writing the sentence

    The set of all x in X such that x is greater than a is not the empty set.


When reading mathematics you should mentally translate all symbols in this 
fashion. If you cannot do this and obtain meaningful sentences, seek help 
from your tutor. And make certain that, when you use mathematical symbols 
yourself, what you write can be translated into meaningful sentences. 


§0b Concerning functions

The terminology we use in connection with functions could conceivably differ 
from that to which you are accustomed; so a list of definitions of the terms 
we use is provided here. 

• The notation ‘$f\colon A \to B$’ (read ‘f, from A to B’) means that f is a
function with domain A and codomain B. In other words, f is a rule which
assigns to every element a of the set A an element in the set B denoted
by ‘$f(a)$’.

• A map is the same thing as a function. The term mapping is also used. 

• A function $f\colon A \to B$ is said to be injective (or one-to-one) if and only
if no two distinct elements of A yield the same element of B. In other words,
f is injective if and only if for all $a_1, a_2 \in A$, if $f(a_1) = f(a_2)$ then $a_1 = a_2$.

• A function $f\colon A \to B$ is said to be surjective (or onto) if and only if for
every element b of B there is an a in A such that $f(a) = b$.

• If a function is both injective and surjective we say that it is bijective 
(or a one-to-one correspondence). 

• The image of a function $f\colon A \to B$ is the subset of B consisting of all
elements obtained by applying f to elements of A. That is,

    $\operatorname{im} f = \{\, f(a) \mid a \in A \,\}$.

An alternative notation is ‘$f(A)$’ instead of ‘$\operatorname{im} f$’. Clearly, f is surjective if
and only if $\operatorname{im} f = B$.

• The notation ‘$a \mapsto b$’ means ‘a maps to b’; in other words, the function
involved assigns the element b to the element a. Thus ‘$a \mapsto b$ under f’ means
exactly the same as ‘$f(a) = b$’.

• If $f\colon A \to B$ is a function and C a subset of B then the inverse image
or preimage of C is the subset of A

    $f^{-1}(C) = \{\, a \in A \mid f(a) \in C \,\}$.

(The above line reads ‘f inverse of C, which is the set of all a in A such that
f of a is in C.’ Alternatively, one could say ‘The inverse image of C under
f’ instead of ‘f inverse of C’.)
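For finite sets the definitions above can be tested mechanically. The following is a minimal sketch, assuming that a function is stored as a Python dictionary from elements of A to elements of B; the helper names (image, is_injective, is_surjective, preimage) and the sample sets are our own illustrative choices and are not used elsewhere in the book.

```python
def image(f, A):
    """Return im f = { f(a) | a in A } for a function stored as a dict."""
    return {f[a] for a in A}


def is_injective(f, A):
    """f is injective exactly when distinct elements give distinct values."""
    return len(image(f, A)) == len(A)


def is_surjective(f, A, B):
    """f is surjective exactly when im f = B."""
    return image(f, A) == set(B)


def preimage(f, A, C):
    """Return the inverse image { a in A | f(a) in C }."""
    return {a for a in A if f[a] in C}


# A hypothetical example: A = {1, 2, 3}, B = {"x", "y"}.
A = {1, 2, 3}
B = {"x", "y"}
f = {1: "x", 2: "y", 3: "x"}

print(is_injective(f, A))        # 1 and 3 both map to "x"
print(is_surjective(f, A, B))    # every element of B is hit
print(preimage(f, A, {"x"}))     # the elements that map into {"x"}
```

Note that preimage makes sense even though this f has no inverse function: the notation $f^{-1}(C)$ refers to a set, not to a map from B to A.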
§0c Concerning vector spaces

Vector spaces enter into this course only briefly; the facts we use are set out 
in this section. 

Associated with each vector space is a set of scalars. In the common
and familiar examples this is $\mathbb{R}$, the set of all real numbers, but in general it
can be any field. (Fields are defined in Chapter 2.)

Let V be a vector space over F. (That is, F is the associated field of 
scalars.) Elements of V can be added and multiplied by scalars: 

(*) If $v, w \in V$ and $\lambda \in F$ then $v + w,\ \lambda v \in V$.

These operations of addition and multiplication by scalars satisfy the following
properties:

(i) $(u + v) + w = u + (v + w)$ for all $u, v, w \in V$.

(ii) $u + v = v + u$ for all $u, v \in V$.

(iii) There exists an element $0 \in V$ such that $v + 0 = v$ for all $v \in V$.

(iv) For each $v \in V$ there exists an element $-v \in V$ such that $v + (-v) = 0$.

(v) $\lambda(\mu v) = (\lambda\mu)v$ for all $\lambda, \mu \in F$ and all $v \in V$.

(vi) $1v = v$ for all $v \in V$.

(vii) $\lambda(v + w) = \lambda v + \lambda w$ for all $\lambda \in F$ and all $v, w \in V$.

(viii) $(\lambda + \mu)v = \lambda v + \mu v$ for all $\lambda, \mu \in F$ and all $v \in V$.

The properties listed above are in fact the vector space axioms; thus in 
order to prove that a set V is a vector space over a field F one has only to 
check that (*) and (i)—(viii) are satisfied. 

Let V be a vector space over F and let $v_1, v_2, \dots, v_n \in V$. The elements
$v_1, v_2, \dots, v_n$ are said to be linearly independent if the following statement is
true:

    If $\lambda_1, \lambda_2, \dots, \lambda_n \in F$ and $\lambda_1 v_1 + \lambda_2 v_2 + \dots + \lambda_n v_n = 0$
    then $\lambda_1 = 0$, $\lambda_2 = 0$, ..., $\lambda_n = 0$.

The elements $v_1, v_2, \dots, v_n$ are said to span the space V if the following
statement is true:

    For every $v \in V$ there exist $\lambda_1, \lambda_2, \dots, \lambda_n \in F$
    such that $v = \lambda_1 v_1 + \lambda_2 v_2 + \dots + \lambda_n v_n$.

A basis of a vector space V is a finite subset of V whose elements are linearly 
independent and span V. 
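In the special case where V is the space of n-tuples of rational numbers, both conditions can be checked by row reduction. The sketch below is only an illustration under that assumption: it uses the standard fact (not proved in these notes) that the vectors are linearly independent exactly when the rank of the matrix having them as rows equals the number of vectors, and that they span the space exactly when the rank equals n. The function names are hypothetical.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of the matrix whose rows are the given vectors (exact arithmetic)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0  # number of pivot rows found so far
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear the column in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    return rank(vectors) == len(vectors)


def spans(vectors, n):
    """True if the vectors span the space of n-tuples."""
    return rank(vectors) == n


def is_basis(vectors, n):
    return is_linearly_independent(vectors) and spans(vectors, n)


print(is_basis([(1, 0), (1, 1)], 2))
print(is_linearly_independent([(1, 2), (2, 4)]))
```

Exact fractions are used rather than floating point so that a zero entry really is zero.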
We can now state the only theorem of vector space theory which is used
in this course.

0.1 Theorem If a vector space V has a basis then any two bases of V will
have the same number of elements.

Comment ▷▷▷

0.1.1 If V has a basis then the dimension of V is by definition the
number of elements in a basis. ▷▷▷


§0d Some very obvious things about proofs 

When trying to prove something, the logical structure of what you are trying
to prove determines the logical structure of the proof. The following
observations seem trivial, yet they are often ignored. 

• To prove a statement of the form 

If p then q 

your first line should be 

Assume that p is true 

and your last line 

Therefore q is true. 

• The statement 

p if and only if q 

is logically equivalent to 

If p then q and if q then p, 

and so the proof of such a statement involves first assuming p and proving q,
then assuming q and proving p.

• To prove a statement of the form

All xxxx's are yyyy's,

the first line of your proof should be

Let a be an xxxx

and the last line should be

Therefore a is a yyyy.

(The second line could very well involve invoking the definition of ‘xxxx’
or some theorem about xxxx's to determine things about a; similarly the
second to last line might correspond to the definition of ‘yyyy’.)
When trying to construct a proof it is sometimes useful to assume
the opposite of the thing you are trying to prove, with a view to obtaining
a contradiction. This technique is known as “indirect proof” (or “proof
by contradiction”). The idea is that the conclusion c is a consequence of
the hypotheses $h_1, h_2, \dots$ if and only if the negation of c is incompatible
with $h_1, h_2, \dots$. Hence we may assume the negation of c as an extra
hypothesis, along with $h_1$, $h_2$ etc., and the task is then to show that the
hypotheses contradict each other. Note, however, that although indirect
proof is a legitimate method of proof in all situations, it is not a good policy
to always use indirect proof as a matter of course. Most proofs are naturally
expressed as direct proofs, and to recast them as indirect proofs may make
them more complicated than necessary.

- Examples - 

#1 Suppose that you wish to prove that a function $\lambda\colon X \to Y$ is injective.
Consult the definition of injective. You are trying to prove the following
statement:

For all $x_1, x_2 \in X$, if $\lambda(x_1) = \lambda(x_2)$ then $x_1 = x_2$.

So the first two lines of your proof should be as follows:

Let $x_1, x_2 \in X$.

Assume that $\lambda(x_1) = \lambda(x_2)$.

Then you will presumably consult the definition of the function $\lambda$ to derive
consequences of $\lambda(x_1) = \lambda(x_2)$, and eventually you will reach the final line

Therefore $x_1 = x_2$.


#2 Suppose you wish to prove that $\lambda\colon X \to Y$ is surjective. That is, you
wish to prove

For every $y \in Y$ there exists $x \in X$ with $\lambda(x) = y$.

Your first line must be

Let y be an arbitrary element of Y.

Somewhere in the middle of the proof you will have to somehow define an
element x of the set X (the definition of x is bound to involve y in some
way), and the last line of your proof has to be

Therefore $\lambda(x) = y$.
#3 Suppose that A and B are sets, and you wish to prove that $A \subseteq B$.
(That is, A is a subset of or equal to B.) By definition the statement ‘$A \subseteq B$’
is logically equivalent to

All elements of A are elements of B.

So your first line should be

Let $x \in A$

and your last line should be

Therefore $x \in B$.

#4 Suppose that you wish to prove that $A = B$, where A and B are sets.
The following statements are all logically equivalent to ‘$A = B$’:

(i) For all x, $x \in A$ if and only if $x \in B$.

(ii) (For all x) ((if $x \in A$ then $x \in B$) and (if $x \in B$ then $x \in A$)).

(iii) All elements of A are elements of B and all elements of B are elements
of A.

(iv) $A \subseteq B$ and $B \subseteq A$.

You must do two proofs of the general form given in #3 above.



1 

Ruler and compass constructions 


Abstract algebra is essentially a tool for other branches of mathematics.
Many problems can be clarified and solved by identifying underlying structure
and focussing attention on it to the exclusion of peripheral information
which may only serve to confuse. Moreover, common underlying structures
sometimes occur in widely varying contexts, and are more easily identifiable
for having been previously studied in their own right. In this course we shall
illustrate this idea by taking three classical geometrical problems, translating
them into algebraic problems, and then using the techniques of modern
abstract algebra to investigate them.


§1a Three problems

Geometrical problems arose very early in the history of civilization, presumably
because of their relevance to architecture and surveying. The most
basic and readily available geometrical tools are ruler and compass, for constructing
straight lines and circles; thus it is natural to ask what geometrical
problems can be solved with these tools.†

† Note that the ruler is assumed to be unmarked; that is, it is not a measuring
device but simply an instrument for ruling lines.

It is said that the citizens of Delos in ancient Greece, when in the
grips of a plague, consulted an oracle for advice. They were told that a god
was displeased with their cubical altar stone, which should be immediately
replaced by one double the size. The Delians doubled the length, breadth
and depth of their altar; however, this increased its volume eightfold, and
the enraged god worsened the plague.

Although some historians dispute the authenticity of this story, the
so-called “Delian problem”

(1) Given a cube, construct another cube with double the volume

is one of the most celebrated problems of ancient mathematics. There are 
two other classical problems of similar stature: 

(2) Construct a square with the same area as a given circle 

(3) Trisect a given angle. 

In this course we will investigate whether problems (1), (2) and (3) can be 
solved by ruler and compass constructions. It turns out that they cannot. 

We should comment, however, that although the ancient mathematicians
were unable to prove that these problems were insoluble by ruler and
compass, they did solve them by using curves other than circles and straight
lines.


§1b Some examples of constructions

Before trying to prove that some things cannot be done with ruler and compass,
we need to investigate what can be done with those tools. Much of
what follows may be familiar to you already.

#1 Given straight lines AB and AC intersecting at A the angle BAC can 
be bisected, as follows. Draw a circle centred at A, and let X, Y be the 
points where this circle meets AB, AC. Draw circles of equal radii centred 
at X and Y, and let T be a point of intersection of these circles. (The radius 
must be chosen large enough so that the circles intersect.) Then AT bisects 
the given angle BAC. 

#2 Given lines AB and AC intersecting at A and a line PQ, the angle
BAC can be copied at P, as follows. Draw congruent circles $C_A$, $C_P$ centred
at A and P. Let $C_A$ intersect AB at X and AC at Y, and let $C_P$ intersect
PQ at V. Draw a circle with centre V and radius equal to XY, and let T
be a point of intersection of this circle and $C_P$. Then the angle TPQ equals
the angle BAC.

#3 Given a point A and a line PQ, one can draw a line through A parallel 
to PQ. Simply draw any line through A intersecting PQ at some point X , 
and then copy the angle AXQ at the point A. 

#4 Given a line AB one can construct a point T such that the angle TAB
equals $\frac{\pi}{3}$ radians (60 degrees). Simply choose T to be a point of intersection
of the circle centred at A and passing through B and the circle centred at B
and passing through A.
#5 Given line segments of lengths r, s and t one can construct a line
segment of length rt/s, as follows. Draw distinct lines AP, AQ intersecting
at A and draw circles $C_r$, $C_s$ and $C_t$ of radii r, s and t centred at A. Let $C_r$
intersect AP at B and let $C_s$, $C_t$ intersect AQ at X, Y. Draw a line through
Y parallel to XB, and let C be the point at which it intersects AP. Then
AC has the required length.
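Construction #5 can be checked numerically by carrying it out in coordinates. The sketch below assumes, purely for illustration, that A is the origin, that AP is the x-axis and that AQ makes an angle phi with AP; the function name and the default value of phi are our own choices. The answer should not depend on phi, since the triangles AXB and AYC are similar.

```python
import math


def fourth_proportional(r, s, t, phi=math.pi / 3):
    """Carry out construction #5 in coordinates and return the length AC."""
    # A is the origin, AP is the x-axis, AQ makes an angle phi with AP.
    bx, by = r, 0.0                                  # B on AP with AB = r
    xx, xy = s * math.cos(phi), s * math.sin(phi)    # X on AQ with AX = s
    yx, yy = t * math.cos(phi), t * math.sin(phi)    # Y on AQ with AY = t
    dx, dy = bx - xx, by - xy                        # direction of the line XB
    # Points on the parallel through Y have the form (yx + k*dx, yy + k*dy);
    # this line meets AP (the x-axis) where yy + k*dy = 0.
    k = -yy / dy
    return yx + k * dx                               # x-coordinate of C, which is AC


print(fourth_proportional(2, 3, 4.5))            # compare with 2 * 4.5 / 3
print(fourth_proportional(2, 3, 4.5, phi=1.0))   # a different angle between AP and AQ
```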

#6 There are simple constructions for angles equal to the sum and difference
of two given angles, lengths equal to the sum and difference of two
given lengths, and for a/n and na, where a is a given length and n a given
positive integer. See the exercises at the end of the chapter.

#7 Given line segments of lengths a and b, where $a > b$, it is possible
to construct a line segment of length $\sqrt{ab}$, as follows. First, construct line
segments of lengths $r_1 = \frac12(a + b)$ and $r_2 = \frac12(a - b)$, and draw circles of
radii $r_1$ and $r_2$ with the same centre O. Draw a line through O intersecting
the smaller circle at P, and draw a line through P perpendicular to OP. (A
right-angle can be constructed, for instance, by constructing an angle of $\frac{\pi}{3}$,
bisecting it, and adding on another angle of $\frac{\pi}{3}$.) Let this perpendicular meet
the large circle at Q. Then PQ has the required length, since by Pythagoras'
Theorem $PQ^2 = r_1^2 - r_2^2 = ab$.
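As a quick check of construction #7, one can place O at the origin and P on the positive x-axis, so that the perpendicular through P is a vertical line. The following sketch does exactly that; the coordinates and the function name are assumptions made for the illustration.

```python
import math


def geometric_mean_segment(a, b):
    """Return the length PQ produced by construction #7 for lengths a > b > 0."""
    r1 = (a + b) / 2          # radius of the large circle
    r2 = (a - b) / 2          # radius of the small circle
    # With O = (0, 0) and P = (r2, 0), the perpendicular to OP at P is the
    # line x = r2. It meets the large circle x^2 + y^2 = r1^2 at the point
    # Q = (r2, sqrt(r1^2 - r2^2)), so PQ is the y-coordinate of Q.
    return math.sqrt(r1 * r1 - r2 * r2)


print(geometric_mean_segment(9, 4))   # compare with sqrt(9 * 4)
```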

Further ruler and compass constructions are dealt with in the exercises. 


§1c Constructible numbers


Consider the Delian Problem once more: we are given a cube and wish to
double its volume. We may as well choose our units of length so that the
given cube has sides of length one. Then our problem is to construct a line
segment of length $\sqrt[3]{2}$. The other problems can be stated similarly. A circle of
unit radius has area $\pi$; to construct a square of this area one must construct
a line segment of length $\sqrt{\pi}$. A right-angled triangle with unit hypotenuse
and an angle $\theta$ has other sides $\cos\theta$ and $\sin\theta$; to trisect $\theta$ one must construct
$\cos(\theta/3)$. So the problems become:

(1) Given a unit line segment, construct one of length $\sqrt[3]{2}$.

(2) Given a unit line segment, construct one of length $\sqrt{\pi}$.

(3) Given line segments of lengths 1 and $\cos\theta$, construct one of length
$\cos(\theta/3)$.
To show that Problem 3 cannot be solved by ruler and compass, it will
be sufficient to show that it cannot be done in the case $\theta = \pi/3$. In this case
$\cos\theta = \frac12$. Since a line segment of length $\frac12$ can be constructed given a unit
line segment, it suffices to show that given only a unit line segment it is not
possible to construct one of length $\cos(\pi/9)$. In other words, an angle of 20
degrees cannot be constructed.
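It may help to see which equation the number $\cos(\pi/9)$ satisfies. The sketch below relies on the standard triple angle identity $\cos 3\varphi = 4\cos^3\varphi - 3\cos\varphi$, which is not stated in the text above and is brought in here as an assumption. Taking $3\varphi = \pi/3$ gives $8x^3 - 6x - 1 = 0$ for $x = \cos(\pi/9)$, and the code merely confirms this numerically; it proves nothing about constructibility.

```python
import math

theta = math.pi / 3
x = math.cos(theta / 3)   # cos(pi/9), that is, the cosine of 20 degrees

# Triple angle identity: cos(theta) = 4*cos(theta/3)^3 - 3*cos(theta/3).
triple_angle_residual = 4 * x**3 - 3 * x - math.cos(theta)

# With cos(theta) = 1/2 this becomes the cubic equation 8x^3 - 6x - 1 = 0.
cubic_residual = 8 * x**3 - 6 * x - 1

print(triple_angle_residual, cubic_residual)   # both should be zero up to rounding
```

So trisecting the angle $\pi/3$ amounts to constructing a root of a cubic, which is the same kind of difficulty as in the Delian Problem.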

So, assume that we are given a line segment of length one. We first 
use this segment to define a coordinate system. Let one of the endpoints 
of the segment be the origin (0,0) and the other endpoint the point (1,0). 
After drawing a line through (0, 0) perpendicular to the x-axis we can find the 
position of the point (0,1) by drawing a circle of centre (0, 0) and radius 1. We 
can now proceed to construct further points, lines and circles, in accordance 
with the following rules. We can construct 

(a) a line if it passes through two previously constructed points, 

(b) a circle if its centre is a previously constructed point and its radius the 
distance between two previously constructed points, 

(c) a point if it is the point of intersection of two lines or circles or a circle 
and a line constructed in accordance with (a) and (b). 

We now define a number to be constructible if it is a coordinate of a constructible
point. (Note that since lines perpendicular to the axes can be
constructed, the point $(a, b)$ can be constructed if and only if the points
$(a, 0)$ and $(0, b)$ can both be constructed. Furthermore, since a circle of radius
a and centre O cuts the x-axis at $(a, 0)$ and the y-axis at $(0, a)$, a number
is constructible as an x-coordinate if and only if it is constructible as a
y-coordinate.) Our aim will be to describe completely the set of constructible
numbers and hence show that $\sqrt{\pi}$, $\sqrt[3]{2}$ and $\cos(\pi/9)$ are not constructible.

1.1 Theorem If a and b are constructible numbers then so are $a + b$, $-a$,
$ab$, $a^{-1}$ (if $a \ne 0$) and $\sqrt{a}$ (if $a > 0$).

Proof. Let a and b be constructible numbers. Then the points $(a, 0)$ and
$(0, b)$ can be constructed in accordance with rules (a), (b) and (c) above.
The point $(a + b, 0)$ is the point of intersection of the x-axis and a circle
centre $(a, 0)$ and radius the distance between $(0, 0)$ and $(0, b)$; hence it is
constructible by rules (b) and (c) above. So $a + b$ is constructible.

Draw the line joining $(0, b)$ and $(1, 0)$. Using the process described in
§1b the line through $(a, 0)$ parallel to this can be constructed. It meets the
y-axis at $(0, ab)$. Hence $ab$ is constructible.
The proofs of the other parts are similarly based on constructions given
in §1b, and are omitted. □
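The product construction in the proof is easily verified in coordinates. The following sketch uses exact rational arithmetic; the function name is hypothetical and the sample values are arbitrary.

```python
from fractions import Fraction


def product_by_parallel(a, b):
    """Return the y-coordinate at which the parallel line meets the y-axis."""
    a, b = Fraction(a), Fraction(b)
    # The line joining (0, b) and (1, 0) has slope (0 - b) / (1 - 0) = -b.
    slope = (0 - b) / (1 - 0)
    # The parallel line through (a, 0) is y = slope * (x - a).
    # Setting x = 0 gives the point where it meets the y-axis.
    return slope * (0 - a)


print(product_by_parallel(3, Fraction(5, 2)))   # compare with 3 * 5/2
```

Replacing the point (1, 0) by (c, 0) in the same argument produces ab/c instead of ab, which is the idea behind the construction of inverses.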

Comment ▷▷▷

1.1.1 Sometimes standard geometrical constructions include instructions
like ‘Draw an arbitrary line through A’ or ‘Draw any circle with centre
B and radius large enough to intersect with PQ’. It is not immediately obvious
that the rules (a), (b) and (c) given above are strong enough to permit
constructions such as these. However, it follows from the above theorem
that every element of the set $\mathbb{Q}$ of all rational numbers (numbers of the form
n/m where n and m are integers) is constructible. When asked to draw an
arbitrary line through A we may as well join A to a point with rational coordinates,
and when asked to draw an arbitrary circle with centre B we may
as well draw one that has rational radius. There will always be a rational
number of suitable size, since rational numbers exist which are arbitrarily
close to any given real number. So in fact anything that can be constructed
with ruler and compass can be constructed by just following rules (a), (b)
and (c). ▷▷▷

Obviously one cannot draw an infinite number of circles and/or lines;
so the number of points obtained in any geometrical construction is finite.
Suppose that $\alpha_1, \alpha_2, \dots, \alpha_n$ are the points occurring in a given construction,
listed in the order in which they are constructed, with $\alpha_1 = (0, 0)$ and
$\alpha_2 = (1, 0)$. Let the coordinates of $\alpha_i$ be $(a_i, b_i)$ (for each i). Suppose that
we now wish to construct another point. According to the rules we can draw
a circle with centre $\alpha_i$ and radius equal to the distance between $\alpha_j$ and $\alpha_k$
(for any choice of i, j and k) and we can draw a straight line joining $\alpha_i$ and
$\alpha_j$ (for any i and j). The points of intersection of such circles and lines are
the only points that we can construct at the next stage. (We can, of course,
get further points by repeating the process.) The equation of such a circle is

(1)    $(x - a_i)^2 + (y - b_i)^2 = (a_j - a_k)^2 + (b_j - b_k)^2$

and the equation of such a line is

(2)    $(b_j - b_i)x - (a_j - a_i)y = a_i b_j - a_j b_i$.

Hence the coordinates of the next point obtained can be found by solving
simultaneously two equations each having one or other of the above forms.
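To see that equation (2) really does describe the line through the two given points, one can compute its coefficients and substitute the points back in. The sketch below does this for sample points with integer coordinates; the function names and the sample points are illustrative assumptions only.

```python
def line_through(p, q):
    """Return (A, B, C) such that the line through p and q is A*x - B*y = C,
    with A, B, C read off from equation (2)."""
    (ai, bi), (aj, bj) = p, q
    return bj - bi, aj - ai, ai * bj - aj * bi


def circle_from_points(centre, p, q):
    """Return (a_i, b_i, rhs) for equation (1): the circle with the given centre
    and radius equal to the distance between p and q."""
    (ai, bi), (aj, bj), (ak, bk) = centre, p, q
    return ai, bi, (aj - ak) ** 2 + (bj - bk) ** 2


def on_line(line, point):
    A, B, C = line
    x, y = point
    return A * x - B * y == C


line = line_through((1, 2), (4, 3))
print(line)
print(on_line(line, (1, 2)), on_line(line, (4, 3)))
```

Notice that the coefficients are obtained from the coordinates using only subtraction and multiplication, which is the observation that the next theorem exploits.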
1.2 Theorem Let $\alpha_1, \alpha_2, \dots, \alpha_k$ be the points obtained in a ruler and
compass construction, listed in the order obtained. For each n let $S_n$ be the
set of all real numbers obtainable from the coordinates of $\alpha_1, \alpha_2, \dots, \alpha_n$
by finite sequences of operations of addition, subtraction, multiplication and
division. Then there exists $a_n \in S_n$ such that the coordinates of $\alpha_{n+1}$ lie in
the set $S_n(\sqrt{a_n}) = \{\, p + q\sqrt{a_n} \mid p, q \in S_n \,\}$.

Proof. As described in the preamble, the coordinates of $\alpha_{n+1}$ are obtained
by solving simultaneously two equations like (1) or (2). There are three cases
to consider: both of the form (1), both of the form (2), and one of each. Let
us deal with the last case first.

On expanding, (1) has the form

(3)    $x^2 + y^2 + px + qy + r = 0$

with $p, q, r \in S_n$. Similarly, (2) has the form

(4)    $sx + ty + u = 0$

with $s, t, u \in S_n$ and either $s \ne 0$ or $t \ne 0$. If $s \ne 0$, rewrite (4) as

(5)    $x = -\frac{t}{s}\,y - \frac{u}{s}$

and substitute into (3). This gives

(6)    $p'y^2 + q'y + r' = 0$,

where in fact

    $p' = \frac{t^2}{s^2} + 1, \qquad q' = \frac{2tu}{s^2} - \frac{pt}{s} + q, \qquad r' = \frac{u^2}{s^2} - \frac{pu}{s} + r$,

but all that concerns us is that $p', q', r' \in S_n$ and $p' \ne 0$. Let $a_n = (q')^2 - 4p'r'$,
an element of $S_n$. Using the quadratic formula to solve (6) shows that the
y-coordinate of $\alpha_{n+1}$ is in $S_n(\sqrt{a_n})$ and then (5) shows that the x-coordinate
is too. (If $s = 0$ then $t \ne 0$, and the same argument applies with the roles of
x and y interchanged.)
In the case that both equations have the form (1) (so that $\alpha_{n+1}$ is the
point of intersection of two circles) we must solve simultaneously equation
(3) and a similar equation

(7)    $x^2 + y^2 + lx + my + n = 0$.

But subtracting (7) from (3) gives an equation of the form (4); so we can
proceed as before.

In the remaining case both equations have the form (4), with coefficients
in the set $S_n$. To solve them just involves operations of addition,
subtraction, multiplication and division, and since by definition these operations
cannot take us outside the set $S_n$ it follows that the coordinates of
$\alpha_{n+1}$ lie in $S_n = S_n(\sqrt{0})$. □
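The mixed case of the proof (one circle and one line) can be followed step by step with exact rational arithmetic. The sketch below assumes rational coefficients and $s \ne 0$, substitutes the line into the circle, and reports the y-coordinates in the form $c \pm d\sqrt{\alpha}$ with c, d and $\alpha$ rational. The function names and the return convention are our own, chosen only for this illustration.

```python
from fractions import Fraction


def circle_line_quadratic(p, q, r, s, t, u):
    """Coefficients of the quadratic in y obtained by substituting
    x = -(t/s)*y - u/s into x^2 + y^2 + p*x + q*y + r = 0 (case s != 0)."""
    p, q, r, s, t, u = (Fraction(v) for v in (p, q, r, s, t, u))
    if s == 0:
        raise ValueError("this sketch only treats the case s != 0")
    p1 = t * t / (s * s) + 1
    q1 = 2 * t * u / (s * s) - p * t / s + q
    r1 = u * u / (s * s) - p * u / s + r
    return p1, q1, r1


def y_coordinates(p, q, r, s, t, u):
    """Return (c, d, alpha): the solutions are y = c + d*sqrt(alpha) and
    y = c - d*sqrt(alpha), by the quadratic formula."""
    p1, q1, r1 = circle_line_quadratic(p, q, r, s, t, u)
    alpha = q1 * q1 - 4 * p1 * r1
    return -q1 / (2 * p1), 1 / (2 * p1), alpha


# Unit circle x^2 + y^2 - 1 = 0 and the line x - y = 0.
print(y_coordinates(0, 0, -1, 1, -1, 0))
```

For the unit circle and the line $x = y$ the quadratic is $2y^2 - 1 = 0$, so the only new quantity needed is a single square root of a number already available, exactly as the theorem asserts.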

What Theorems 1.1 and 1.2 show is, essentially, that with ruler and 
compass one can add, subtract, multiply, divide and take square roots, and 
nothing else. To solve the Delian Problem one must construct the cube root of 
two. You may think that we have already settled the matter, since obviously 
it is impossible to find a cube root by taking square roots. Unfortunately, 
however, this is not obvious at all. How do you know, for instance, that the 
following formula is not correct? 

    $\sqrt[3]{2} = \dfrac{a^3 + 4b^3}{3a^2b - 2ab^2 + 2b\sqrt{2b^4 + a^2b^2 - a^3b}}$

where

    $a = 8 + 5\sqrt{10}$

    $b = -10 + 6\sqrt{10} + \sqrt{225 - 40\sqrt{10}}$.

Or if you can show that it is not, how do you know that there is not some far 
more complicated formula of the same kind which is correct? The algebraic 
machinery which will be developed in the subsequent chapters will show that 
there is not. 
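A numerical experiment shows why the question cannot be settled by simply evaluating the two sides. The sketch below reads the displayed formula as a claimed expression for the cube root of two, built from a and b by the four arithmetic operations and square roots, and evaluates both sides in floating point. The variable names are ours.

```python
import math

sqrt10 = math.sqrt(10)
a = 8 + 5 * sqrt10
b = -10 + 6 * sqrt10 + math.sqrt(225 - 40 * sqrt10)

# Right-hand side of the displayed formula.
candidate = (a**3 + 4 * b**3) / (
    3 * a**2 * b - 2 * a * b**2 + 2 * b * math.sqrt(2 * b**4 + a**2 * b**2 - a**3 * b)
)

cube_root_of_two = 2 ** (1 / 3)
gap = abs(candidate - cube_root_of_two)

print(candidate, cube_root_of_two, gap)
```

The two values come out close to one another. Floating point arithmetic can only ever tell us that two numbers are close; it can never tell us that they are exactly equal, nor can it rule out some other formula agreeing with the cube root of two to more decimal places. That is why an algebraic argument is needed.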



Exercises 

1. Describe carefully how to perform the constructions mentioned in #6. 

(Hint: For a/n and na use #5.) 
2. Given a line segment of length 1 unit, construct a line segment of each
of the following lengths: 

(i) V2 (ii ) (in) V3-V2 

(iv) \f2\f?> ( v ) (vi) \J y/2 + y/3. 

Measure the line segments in each case to check the accuracy of your 
constructions. 


3. Let $\theta = 2\pi/5$ and let $\alpha = e^{i\theta} = \cos\theta + i\sin\theta$, where i is a complex
square root of $-1$. Thus $\alpha$ is a complex fifth root of 1. Show that

    $x^4 + x^3 + x^2 + x + 1 = (x - \alpha)(x - \alpha^{-1})(x - \alpha^2)(x - \alpha^{-2})$
        $= (x^2 - 2(\cos\theta)x + 1)(x^2 - 2(\cos 2\theta)x + 1)$.

Hence show that

    $\cos\theta + \cos 2\theta = -\frac12$   and   $\cos\theta \cos 2\theta = -\frac14$,

and solve to find $\cos\theta$.


4. Is it possible to construct an angle of $2\pi/5$?

5. Let OAB be an isosceles triangle with $OA = OB$ and the angle at O
equal to $\pi/5$. Let the bisector of the angle OBA meet OA at the point P.

(i) Prove that the triangles OAB and BAP are similar.

(ii) Suppose that OA has length 1 and let AB have length x. Use the
first part to prove that $x^2 + x - 1 = 0$.

(Hint: Prove that PA has length $1 - x$.)

(iii) Use the previous part to show that a regular decagon inscribed in
a unit circle has sides of length $\frac{\sqrt{5} - 1}{2}$, and hence devise a ruler and
compass construction for a regular decagon.

6. Let $\theta = 2\pi/17$ and let $\omega = e^{i\theta} = \cos\theta + i\sin\theta$, a complex 17th root of 1.
Prove that

    $x^{16} + x^{15} + x^{14} + \dots + x^2 + x + 1$
        $= (x - \omega)(x - \omega^{-1})(x - \omega^2)(x - \omega^{-2}) \cdots (x - \omega^8)(x - \omega^{-8})$
        $= (x^2 - (2\cos\theta)x + 1)(x^2 - (2\cos 2\theta)x + 1) \cdots (x^2 - (2\cos 8\theta)x + 1)$.
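7. Let

    $\alpha_1 = \frac{-1 + \sqrt{17}}{2}, \qquad \alpha_2 = \frac{-1 - \sqrt{17}}{2}$,

    $\beta_1 = \frac12\bigl(\alpha_1 + \sqrt{\alpha_1^2 + 4}\bigr)$
    $\beta_2 = \frac12\bigl(\alpha_1 - \sqrt{\alpha_1^2 + 4}\bigr)$
    $\beta_3 = \frac12\bigl(\alpha_2 + \sqrt{\alpha_2^2 + 4}\bigr)$
    $\beta_4 = \frac12\bigl(\alpha_2 - \sqrt{\alpha_2^2 + 4}\bigr)$

    $\gamma_1 = \frac12\bigl(\beta_1 + \sqrt{\beta_1^2 - 4\beta_3}\bigr)$        $\gamma_5 = \frac12\bigl(\beta_3 + \sqrt{\beta_3^2 - 4\beta_2}\bigr)$
    $\gamma_2 = \frac12\bigl(\beta_1 - \sqrt{\beta_1^2 - 4\beta_3}\bigr)$        $\gamma_6 = \frac12\bigl(\beta_3 - \sqrt{\beta_3^2 - 4\beta_2}\bigr)$
    $\gamma_3 = \frac12\bigl(\beta_2 + \sqrt{\beta_2^2 - 4\beta_4}\bigr)$        $\gamma_7 = \frac12\bigl(\beta_4 + \sqrt{\beta_4^2 - 4\beta_1}\bigr)$
    $\gamma_4 = \frac12\bigl(\beta_2 - \sqrt{\beta_2^2 - 4\beta_4}\bigr)$        $\gamma_8 = \frac12\bigl(\beta_4 - \sqrt{\beta_4^2 - 4\beta_1}\bigr)$.

(i) Check that $\gamma_1 + \gamma_2 = \beta_1$ and $\gamma_1\gamma_2 = \beta_3$, and hence show that

    $(x^2 - \gamma_1 x + 1)(x^2 - \gamma_2 x + 1) = x^4 - \beta_1 x^3 + (2 + \beta_3)x^2 - \beta_1 x + 1$.

Similarly

    $(x^2 - \gamma_3 x + 1)(x^2 - \gamma_4 x + 1) = x^4 - \beta_2 x^3 + (2 + \beta_4)x^2 - \beta_2 x + 1$,

    $(x^2 - \gamma_5 x + 1)(x^2 - \gamma_6 x + 1) = x^4 - \beta_3 x^3 + (2 + \beta_2)x^2 - \beta_3 x + 1$,

    $(x^2 - \gamma_7 x + 1)(x^2 - \gamma_8 x + 1) = x^4 - \beta_4 x^3 + (2 + \beta_1)x^2 - \beta_4 x + 1$.

(ii) Check that

    $(x^4 - \beta_1 x^3 + (2 + \beta_3)x^2 - \beta_1 x + 1)(x^4 - \beta_2 x^3 + (2 + \beta_4)x^2 - \beta_2 x + 1)$
        $= x^8 + \frac{1 - \sqrt{17}}{2}x^7 + \frac{5 - \sqrt{17}}{2}x^6 + \frac{7 - \sqrt{17}}{2}x^5 + (2 - \sqrt{17})x^4$
        $\quad + \frac{7 - \sqrt{17}}{2}x^3 + \frac{5 - \sqrt{17}}{2}x^2 + \frac{1 - \sqrt{17}}{2}x + 1$.

The product of the other two quartics appearing in part (i) is
similar: just replace $-\sqrt{17}$ by $\sqrt{17}$.

(iii) Multiply the eighth degree polynomial appearing in part (ii) by its
conjugate (obtained by replacing $-\sqrt{17}$ by $\sqrt{17}$) and show that the
result is $x^{16} + x^{15} + x^{14} + \dots + x^2 + x + 1$. Comparing with the previous
exercise, deduce that the numbers $\gamma_1, \gamma_2, \dots, \gamma_8$ are equal to
$2\cos\theta$, $2\cos 2\theta$, ..., $2\cos 8\theta$ (not necessarily in that order), where
$\theta = 2\pi/17$.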

(iv) Use the previous parts to deduce that a regular seventeen sided 
polygon can be constructed with ruler and compass. 
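Before attempting the algebra it can be reassuring to check the claim of part (iii) numerically. The sketch below takes $\gamma_1, \gamma_2$ to have sum $\beta_1$ and product $\beta_3$, takes $\gamma_3, \gamma_4$ to have sum $\beta_2$ and product $\beta_4$, takes $\gamma_5, \gamma_6$ to have sum $\beta_3$ and product $\beta_2$, and takes $\gamma_7, \gamma_8$ to have sum $\beta_4$ and product $\beta_1$; with this choice every square root involved is the square root of a positive number. The helper name pair is our own. It compares the eight numbers with $2\cos k\theta$ for k from 1 to 8.

```python
import math

s17 = math.sqrt(17)
a1, a2 = (-1 + s17) / 2, (-1 - s17) / 2

b1 = (a1 + math.sqrt(a1**2 + 4)) / 2
b2 = (a1 - math.sqrt(a1**2 + 4)) / 2
b3 = (a2 + math.sqrt(a2**2 + 4)) / 2
b4 = (a2 - math.sqrt(a2**2 + 4)) / 2


def pair(total, product):
    """Both roots of x^2 - total*x + product = 0, the larger one first."""
    root = math.sqrt(total**2 - 4 * product)
    return (total + root) / 2, (total - root) / 2


# gamma_1, ..., gamma_8, built using square roots only.
gammas = [*pair(b1, b3), *pair(b2, b4), *pair(b3, b2), *pair(b4, b1)]

theta = 2 * math.pi / 17
cosines = [2 * math.cos(k * theta) for k in range(1, 9)]

# Compare the two lists after sorting, since the order is not specified.
max_gap = max(abs(g - c) for g, c in zip(sorted(gammas), sorted(cosines)))
print(max_gap)
```

Of course a numerical check of this kind is not a proof, but it is a useful safeguard against slips in the subscripts.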



2 

Introduction to rings 


In this chapter we introduce the concepts which will be fundamental to the 
rest of the course, and which are necessary to adequately understand the set 
of constructible numbers. 


§2a Operations on sets 

If a and b are real numbers their sum a + b and product ab are also real 
numbers. Addition and multiplication are examples of operations on the set 
of real numbers. Operations can be defined in many different ways on many 
different sets. For example, division of nonzero real numbers, addition and 
multiplication of 2 × 2 matrices over $\mathbb{R}$ (where $\mathbb{R}$ is the set of all real numbers)
and so on. Let us state precisely what is meant by ‘operation’: 

2.1 Definition An operation on a set S is a rule which assigns to each
ordered pair of elements of S a uniquely determined element of S.

Thus, for example, addition is the rule which assigns to the ordered
pair $(a, b)$ of real numbers the real number $a + b$. Since the set of all ordered
pairs of elements of S is usually denoted by ‘$S \times S$’,

    $S \times S = \{\, (a, b) \mid a \in S \text{ and } b \in S \,\}$,

we could alternatively state Definition 2.1 as follows: an operation on S is
a function $S \times S \to S$. Addition on $\mathbb{R}$ is the function

    $\mathbb{R} \times \mathbb{R} \to \mathbb{R}$

    $(a, b) \mapsto a + b$.

In this course we will be concerned with many examples of sets equipped
with two operations which have properties resembling addition and multiplication
of numbers. The confusing thing is that sometimes some of the familiar
properties are not satisfied. For example, addition and multiplication can be
defined on all the following sets:

$\mathbb{R}$                      (real numbers)

$\mathbb{Z}$                      (integers)

$\mathrm{Mat}(2, \mathbb{R})$     (2 × 2 matrices whose entries are real numbers)

$2\mathbb{Z}$                     (even integers)

$\mathbb{R}[X]$                   (polynomials in X with real coefficients, that is,
                                  expressions like $2 + 5X + X^2$).

Each of these sets possesses a zero element; that is, an element 0 such that
$a + 0 = a = 0 + a$ for all elements a in the set. In four of the five examples
the product of two nonzero elements is nonzero; however, this property fails
for $\mathrm{Mat}(2, \mathbb{R})$. Similarly, in four of the examples there is an identity element;
that is, an element 1 such that $a1 = a = 1a$ for all a. However, this property
fails for $2\mathbb{Z}$ (no even integer is an identity element). In $\mathbb{R}$ for any two nonzero
elements a and b there is another element c such that $a = bc$. None of the
other sets have this property. And the rule that $\alpha\beta = \beta\alpha$ for all $\alpha$ and $\beta$ is
not satisfied in $\mathrm{Mat}(2, \mathbb{R})$ but is in all the others.
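Two of the failures just mentioned are easy to exhibit explicitly. In the sketch below 2 × 2 matrices are stored as tuples of rows; the helper mat_mul and the particular matrices A and B are our own illustrative choices. The last lines search a range of even integers for a possible identity element of $2\mathbb{Z}$, which is of course only a finite check and not a proof.

```python
def mat_mul(A, B):
    """Product of two 2 x 2 matrices, each stored as a tuple of two rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


ZERO = ((0, 0), (0, 0))
A = ((0, 1), (0, 0))
B = ((1, 0), (0, 0))

AB = mat_mul(A, B)
BA = mat_mul(B, A)

print(AB == ZERO)   # a product of two nonzero matrices can be zero
print(AB == BA)     # multiplication of matrices is not commutative

# An identity element e of the even integers would have to satisfy e * 2 == 2.
candidates = [e for e in range(-100, 101, 2) if e * 2 == 2]
print(candidates)
```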

We attempt to bring some order to this chaos by listing the most important
properties, investigating which properties are consequences of which
other properties, and classifying the various systems according to which properties
hold.

§2b The basic definitions 

2.2 Definition A ring is a set R together with two operations on R,
called addition ($(a, b) \mapsto a + b$) and multiplication ($(a, b) \mapsto ab$), such that
the following axioms are satisfied:

(i) $(a + b) + c = a + (b + c)$ for all $a, b, c \in R$.

(That is, addition is associative.)

(ii) There is an element $0 \in R$ such that $a + 0 = 0 + a = a$ for all $a \in R$.

(There is a zero element.)

(iii) For each $a \in R$ there is a $b \in R$ such that $a + b = b + a = 0$.

(Each element has a negative.)

(iv) $a + b = b + a$ for all $a, b \in R$. (Addition is commutative.)

(v) $(ab)c = a(bc)$ for all $a, b, c \in R$. (Multiplication is associative.)

(vi) $a(b + c) = ab + ac$ and $(a + b)c = ac + bc$ for all $a, b, c \in R$.

(Both distributive laws hold.)
Comment ▷▷▷

2.2.1 The five examples mentioned in §2a above are all rings. ▷▷▷

2.3 Definition A commutative ring is a ring which satisfies $ab = ba$ for
all elements a, b.

That is, a commutative ring is a ring which satisfies the commutative law for
multiplication. (All rings satisfy the commutative law for addition.)

2.4 Definition If R is a ring an element $e \in R$ is called an identity if
$ea = ae = a$ for all $a \in R$.

Comment »P 

2.4.1 We will almost always use the symbol ‘T rather than ‘e’ to denote 

an identity element. Not all rings have identity elements—for example 2Z 
does not have one. >» 

2.5 Definition If R is a commutative ring and a ∈ R is such that a ≠ 0
and ab = 0 for some nonzero b ∈ R then a is called a zero divisor.

2.6 Definition An integral domain is a commutative ring which has an
identity element which is nonzero (that is, 1 ≠ 0) and no zero divisors.

Thus, in an integral domain the following property holds: 

2.6.1 For all a, b, if a ≠ 0 and b ≠ 0 then ab ≠ 0.
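To see Definition 2.5 in action, the sketch below lists the zero divisors among the integers 0, …, n − 1 with multiplication taken modulo n. This example is an assumption made for illustration (these rings are only defined later in the book), and the function name `zero_divisors` is hypothetical.

```python
def zero_divisors(n):
    """Nonzero a in {0, ..., n-1} such that a*b is 0 modulo n for some nonzero b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

print(zero_divisors(6))  # [2, 3, 4]
print(zero_divisors(7))  # []
```

With n = 6 property 2.6.1 fails (for instance 2 · 3 is zero modulo 6), whereas with n = 7 no zero divisors are found.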

2.7 Definition Let R be a ring with an identity and let a ∈ R. An element
b ∈ R is called a multiplicative inverse of a if ab = ba = 1.

Comment »> 

2.7.1 It can be proved (see the exercises at the end of the chapter) that
if an element has a multiplicative inverse then the inverse is unique. That is,
if b, c ∈ R satisfy ab = ba = 1 and ac = ca = 1 then b = c. This fact means
that there is no ambiguity in using the usual notation ‘a⁻¹’ for the inverse
of an element a which has an inverse. (Compare with the remarks following
Theorem 2.9 below.) >»

2.8 Definition A field is a commutative ring in which there is a nonzero
identity element, and every nonzero element has a multiplicative inverse.

Thus, in a field the following property holds:

2.8.1 For all a ≠ 0 there exists a b such that
ab = ba = 1 (where 1 is the identity element).


We wish to investigate properties of the set of constructible numbers,
which, as it happens, is a field. However, it turns out to be necessary to
study other rings first before we can adequately describe and understand the
relevant properties of constructible numbers. To familiarize ourselves with
the various concepts we start by considering some examples.

#1 Fields

By definition a field satisfies all the ring axioms, and also

(i) multiplication is commutative,

(ii) there exists an identity element 1 ≠ 0,

(iii) all nonzero elements have multiplicative inverses.

The following are fields:

ℝ (the set of all real numbers),

ℚ (the set of all rational numbers),

ℂ (the set of all complex numbers),

ℚ[√2] (the set of all numbers of the form a + b√2 for a, b ∈ ℚ),

Con (the set of all real numbers which are constructible;
that is, { a ∈ ℝ | a is constructible }).

(It is straightforward to check that ℝ and ℚ satisfy the field axioms; a little
more work is needed for the remaining examples.)
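As an indication of the extra work needed for ℚ[√2], the sketch below computes multiplicative inverses there. It is an illustration rather than a proof: elements a + b√2 are stored as pairs (a, b) of rationals, and the function names `mul` and `inverse` are hypothetical. The key identity is (a + b√2)(a − b√2) = a² − 2b², which is a nonzero rational whenever a and b are not both zero because √2 is irrational.

```python
from fractions import Fraction

def mul(x, y):
    """(a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def inverse(x):
    """Inverse of a + b√2, obtained by dividing a - b√2 by a² - 2b²."""
    a, b = x
    norm = a * a - 2 * b * b
    if norm == 0:
        raise ZeroDivisionError("zero has no multiplicative inverse")
    return (a / norm, -b / norm)

x = (Fraction(3), Fraction(1))   # the element 3 + √2
print(inverse(x))                # (Fraction(3, 7), Fraction(-1, 7))
print(mul(x, inverse(x)))        # (Fraction(1, 1), Fraction(0, 1))
```

So the inverse of 3 + √2 is 3/7 − (1/7)√2, which is again of the form a + b√2 with a, b ∈ ℚ.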

#2 Integral domains 

By definition an integral domain satisfies all the ring axioms, and also

(i) multiplication is commutative,

(ii) there exists an identity element 1 ≠ 0,

(iii) there are no zero divisors. (That is, there do not exist elements a, b
such that a ≠ 0 and b ≠ 0 but ab = 0.)

The following are integral domains:

all the fields listed in #1 above,

ℤ (the set of all integers),

ℝ[X] (the set of all polynomials in X with coefficients from
the field ℝ),

ℤ[X] (the set of all polynomials in X with coefficients from
the integral domain ℤ).

We will prove in §2d below that all fields are integral domains.

#3 Other commutative rings 

There exist commutative rings with no zero divisors which are not integral
domains because they do not have identity elements. Examples of this are
2ℤ (the set of all even integers), 3ℤ (the set of all integers divisible by 3),
and so on. In general, however, commutative rings are likely to have zero
divisors. An example is

\[ \left\{ \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \;\middle|\; a, b \in \mathbb{R} \right\}, \]

the set of all diagonal 2 × 2 matrices over ℝ. Further examples are provided
by the rings ℤ_n (to be defined later).
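A concrete pair of zero divisors in the ring of diagonal matrices is easy to exhibit. In the sketch below a diagonal matrix with diagonal entries a and b is stored as the pair (a, b); this representation and the name `diag_mul` are assumptions made for the illustration.

```python
def diag_mul(x, y):
    """Product of two diagonal 2 x 2 matrices, each stored as its diagonal (a, b)."""
    return (x[0] * y[0], x[1] * y[1])

x = (1, 0)   # the matrix with diagonal entries 1, 0
y = (0, 1)   # the matrix with diagonal entries 0, 1
print(diag_mul(x, y))  # (0, 0)
```

Neither x nor y is the zero matrix, yet their product is, so both are zero divisors.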

#4 Other rings 

Mat(2, ℝ), Mat(3, ℝ), ... (that is, square matrices of a given size over ℝ)
are examples of noncommutative rings. In fact, as we will see in the next
section, Mat(n, R) is a ring for any positive integer n and any ring R. Thus,
for instance, Mat(5, 2ℤ) and Mat(4, ℝ[X]) are examples of rings. One can
also have matrices whose entries are matrices; thus Mat(2, Mat(2, ℝ)) is the
ring of all 2 × 2 matrices whose entries are 2 × 2 matrices over ℝ.

Of the different kinds of rings listed above, fields are the simplest, since 
all of the usual properties of multiplication are satisfied. Next come integral 
domains, which only lack multiplicative inverses, then arbitrary commutative 
rings, and finally arbitrary rings, which are the most complicated since very 
few requirements are placed upon the multiplicative structure. 

§2c Two ways of forming rings 

Our principal theoretical objective in this course is to understand field
extensions; that is, relationships that hold between a field and larger fields
having the given field as a subset. (As, for instance, ℚ ⊂ Con ⊂ ℝ.) For this we
need to study various methods for constructing new rings from old ones, and
this also increases our store of examples of rings. In this section we give two
such methods of constructing rings.

#5 Direct sums 

If R and S are rings we may define operations of addition and multiplication
on the set V of all ordered pairs (r, s) with r ∈ R and s ∈ S. We define

(r_1, s_1) + (r_2, s_2) = (r_1 + r_2, s_1 + s_2)

(r_1, s_1)(r_2, s_2) = (r_1 r_2, s_1 s_2).

(That is, the operations are defined “componentwise”.) With these operations
V is a ring, called the direct sum R + S of R and S.
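The componentwise operations can be sketched in code as follows. This is only an illustration under stated assumptions: the class name `Pair` is hypothetical, R is taken to be ℤ (Python integers) and S to be ℚ (Python `Fraction` values), and a few instances of the axioms are spot-checked rather than proved.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Pair:
    """An element (r, s) of a direct sum, with componentwise operations."""
    r: object
    s: object

    def __add__(self, other):
        return Pair(self.r + other.r, self.s + other.s)

    def __mul__(self, other):
        return Pair(self.r * other.r, self.s * other.s)

    def __neg__(self):
        return Pair(-self.r, -self.s)

a = Pair(2, Fraction(1, 2))
b = Pair(-5, Fraction(3))
c = Pair(7, Fraction(-1, 3))
zero = Pair(0, Fraction(0))

print((a + b) + c == a + (b + c))        # True
print(a + zero == a, a + (-a) == zero)   # True True
print(a * (b + c) == a * b + a * c)      # True
```

Each check succeeds because the corresponding axiom holds in each component separately, which is the idea behind the proof of the ring axioms for a direct sum.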

To prove that the direct sum of R and S is a ring it is necessary to
prove that Axioms (i)-(vi) of Definition 2.2 are satisfied. In each case the
proof that R + S satisfies a given axiom simply involves using the same axiom
for R and S. We prove only the first three axioms here, leaving the others
as exercises.

Proof. (i) Let a, b, c ∈ R + S. Then we have a = (r_1, s_1), b = (r_2, s_2)
and c = (r_3, s_3), for some r_1, r_2, r_3 ∈ R and s_1, s_2, s_3 ∈ S, and so

(a + b) + c = ((r_1, s_1) + (r_2, s_2)) + (r_3, s_3)

= (r_1 + r_2, s_1 + s_2) + (r_3, s_3) (by the definition of addition in R + S)

= ((r_1 + r_2) + r_3, (s_1 + s_2) + s_3) (similarly)

= (r_1 + (r_2 + r_3), s_1 + (s_2 + s_3)) (by Axiom (i) for R and S)

= (r_1, s_1) + (r_2 + r_3, s_2 + s_3)

= (r_1, s_1) + ((r_2, s_2) + (r_3, s_3))

= a + (b + c)

as required.

(ii) Let 0_R and 0_S be the zero elements of R and S, and let a be any
element of R + S. We have a = (r, s) for some r ∈ R, s ∈ S. Now

a + (0_R, 0_S) = (r, s) + (0_R, 0_S) = (r + 0_R, s + 0_S) = (r, s) = a,

and similarly (0_R, 0_S) + a = a. Thus (0_R, 0_S) is a zero element for R + S.

(iii) Let a be an arbitrary element of R + S. There exist r ∈ R and s ∈ S
with a = (r, s), and if we let b = (−r, −s) then

a + b = (r, s) + (−r, −s) = (r + (−r), s + (−s)) = (0_R, 0_S).

Similarly b + a = (0_R, 0_S), and since (0_R, 0_S) is the zero element of R + S
this shows that b is a negative of a. □

#6 Square matrices

Suppose that R is a ring and a_1, a_2, a_3, ... are elements of R. The axiom

(a + b) + c = a + (b + c) for all a, b, c ∈ R

shows that the expression a + b + c is well defined. It makes no difference which
way parentheses are inserted: the additions can be done in either order. The
same obviously must apply for any number of terms; so there is no ambiguity
if the parentheses are omitted. That is, the expression a_1 + a_2 + ⋯ + a_n is
well defined. Moreover, the axiom


a + b = b + a for all a, b ∈ R


shows that the order of terms is immaterial. So there is no harm in using
sigma notation: a_1 + a_2 + ⋯ + a_n = Σ_{i=1}^{n} a_i.

(CAUTION: Rings are not necessarily commutative; so the same kind of
thing does not apply for multiplication. In the expression a_1 a_2 ... a_n the
ordering of the factors must not be changed.)


The distributive laws a(b + c) = ab + ac and (a + b)c = ac + bc imply
that

\[ b(a_1 + a_2 + \cdots + a_n) = ba_1 + ba_2 + \cdots + ba_n \]

and

\[ (a_1 + a_2 + \cdots + a_n)b = a_1 b + a_2 b + \cdots + a_n b, \]

or, in sigma notation,

\[ b\Bigl(\sum_{i=1}^{n} a_i\Bigr) = \sum_{i=1}^{n} b a_i \]

and

\[ \Bigl(\sum_{i=1}^{n} a_i\Bigr) b = \sum_{i=1}^{n} a_i b. \]

These formulae are known as the generalized distributive laws.

It is also worth noting that it is legitimate to interchange the order of
summation in double sums:

\[ \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_{ij} \Bigr) = \sum_{j=1}^{m} \Bigl( \sum_{i=1}^{n} a_{ij} \Bigr), \]

where the a_ij are ring elements (for i = 1, 2, ..., n and j = 1, 2, ..., m).
This follows from the commutative law for addition, ring axiom (iv).

If n is a positive integer let Mat(n, R) be the set of all n × n matrices
with entries from R. If A ∈ Mat(n, R) and i, j ∈ {1, 2, ..., n} let A_ij be
the entry in the i-th row and j-th column of A. (Thus A_ij ∈ R for each i, j.)
Define addition and multiplication on Mat(n, R) by

\[ (A + B)_{ij} = A_{ij} + B_{ij} \]

\[ (AB)_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}. \]

That is, addition is defined componentwise, and for the product the (i, j)-th
entry of AB is obtained by multiplying the i-th row of A by the j-th column
of B in the usual way.

It can be shown that, with these operations, Mat(n, R) is a ring. As 
with direct sums, the verification that Mat(n, R) satisfies a given axiom is, 
in most cases, a straightforward calculation based on the fact that R satisfies 
the same axiom. We will only do axioms (i) and (v) (which is harder than 
the others). 

Proof. (i) Let A, B, C ∈ Mat(n, R). Then by the definition of addition
in Mat(n, R) and the associative law for addition in R we have

\begin{align*}
((A + B) + C)_{ij} &= (A + B)_{ij} + C_{ij} \\
&= (A_{ij} + B_{ij}) + C_{ij} \\
&= A_{ij} + (B_{ij} + C_{ij}) \\
&= A_{ij} + (B + C)_{ij} \\
&= (A + (B + C))_{ij}
\end{align*}

and therefore (A + B) + C = A + (B + C).

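(v) Let A, B, C ∈ Mat(n, R). Then

\begin{align*}
((AB)C)_{ij} &= \sum_{k=1}^{n} (AB)_{ik} C_{kj} \\
&= \sum_{k=1}^{n} \Bigl( \sum_{h=1}^{n} A_{ih} B_{hk} \Bigr) C_{kj} \\
&= \sum_{k=1}^{n} \Bigl( \sum_{h=1}^{n} (A_{ih} B_{hk}) C_{kj} \Bigr) \\
&= \sum_{h=1}^{n} \Bigl( \sum_{k=1}^{n} (A_{ih} B_{hk}) C_{kj} \Bigr) && \text{(by interchanging the order of summation)} \\
&= \sum_{h=1}^{n} \Bigl( \sum_{k=1}^{n} A_{ih} (B_{hk} C_{kj}) \Bigr) && \text{(by associativity of multiplication in } R\text{)} \\
&= \sum_{h=1}^{n} A_{ih} \Bigl( \sum_{k=1}^{n} B_{hk} C_{kj} \Bigr) \\
&= \sum_{h=1}^{n} A_{ih} (BC)_{hj} \\
&= (A(BC))_{ij},
\end{align*}

showing that (AB)C = A(BC), as required. (The third and sixth lines use the
generalized distributive laws.)

□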
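The defining formula for the matrix product translates directly into code. The sketch below is a minimal illustration assuming integer entries (so that Python's built-in `sum`, which starts from the integer 0, applies); the function name `mat_mul` is hypothetical. It spot-checks associativity and shows that multiplication in Mat(2, ℤ) is not commutative.

```python
def mat_mul(A, B):
    """(AB)_ij = sum over k of A_ik * B_kj, for square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

print(mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C)))  # True
print(mat_mul(A, B))  # [[2, 1], [4, 3]]
print(mat_mul(B, A))  # [[3, 4], [1, 2]]
```

The first line agrees with Axiom (v); the last two lines show AB ≠ BA for this choice of A and B.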


§2d Trivial properties of rings 

Let R be any ring. By Axiom (ii) in Definition 2.2 we know that there is an
element 0 ∈ R satisfying

($) a + 0 = 0 + a = a for all a ∈ R.

Could there exist another element z ∈ R with the same property, that is,
satisfying

(£) a + z = z + a = a for all a ∈ R?

The answer is no. If z satisfies (£) then z = 0. To see this observe that
putting a = z in ($) gives z + 0 = z, while putting a = 0 in (£) gives
z + 0 = 0. Hence z must equal 0.

In the preceding paragraph we have proved that the zero element of any 
ring is unique. There are a number of other properties which are trivially true 
in the examples of rings familiar to us, and which are equally easily proved 
to hold in any ring; some of these are listed in the theorems in this section. 
Although they are trivial, it is necessary to prove that they are consequences 
of the ring axioms if we wish to claim that they are true in all rings.

2.9 Theorem In any ring R the zero element is unique, and each element
has a unique negative.

Proof. The first part has been proved above. For the second part, assume
that a ∈ R and that b, c ∈ R are both negatives of a. Using Definition 2.2
we have

b = b + 0 (by Axiom (ii))

= b + (a + c) (since c is a negative of a)

= (b + a) + c (Axiom (i))

= 0 + c (since b is a negative of a)

= c (Axiom (ii)).

Thus a cannot have two distinct negatives, which is what we had to prove.

□

In view of the preceding theorem there is no ambiguity in denoting the
negative of an element a by ‘−a’, as usual. It is customary also to write
‘x − y’ for ‘x + (−y)’.

2.10 Theorem Let R be any ring and a, b, c ∈ R.

(i) If a + b = a + c then b = c.

(ii) −(a + b) = (−a) + (−b).

(iii) −(−a) = a.

(iv) a0 = 0a = 0.

(v) a(−b) = −(ab) = (−a)b.

(vi) (−a)(−b) = ab.

(vii) a(b − c) = ab − ac.


Proof. (i) Assume that a + b = a + c. Then we have

(−a) + (a + b) = (−a) + (a + c)

((−a) + a) + b = ((−a) + a) + c (Axiom (i))

0 + b = 0 + c (by definition of ‘−a’)

b = c (Axiom (ii))

as required.

(ii) In view of the uniqueness of negatives it is sufficient to prove that
(−a) + (−b) is a negative of a + b; that is, it is sufficient to prove that

((−a) + (−b)) + (a + b) = 0 = (a + b) + ((−a) + (−b)).

Furthermore, if we prove only the first of these equations, the other will follow
as a consequence of Axiom (iv) in Definition 2.2. But use of the first four
axioms readily gives

((−a) + (−b)) + (a + b) = (−a) + ((−b) + (a + b))

= (−a) + ((−b) + (b + a))

= (−a) + (((−b) + b) + a)

= (−a) + (0 + a)

= (−a) + a

= 0.

(iii) By the definition, a negative of −a is any element b which satisfies
(−a) + b = 0 = b + (−a). But these equations are satisfied if we put b = a;
so it follows that a is a negative of −a. (In other words, the equations that
say that x is a negative of y also say that y is a negative of x.) Since negatives
are unique, we have proved that −(−a) = a.

(iv) a0 + 0 = a0 (Axiom (ii))

= a(0 + 0) (Axiom (ii))

= a0 + a0 (Axiom (vi)).

By part (i) above it follows that a0 = 0. The proof that 0a = 0 is similar.
The proofs of the other parts are left to the exercises. □

Our final result for this chapter gives a connection between two of the 
concepts we have introduced. 

2.11 THEOREM Every field is an integral domain. 

Proof. Comparing the definitions of ‘field’ and ‘integral domain’ we see
that this amounts to proving that there can be no zero divisors in a field. So,
let F be a field and let a ∈ F be a zero divisor. Then by definition a ≠ 0 and
there exists b ∈ F with b ≠ 0 and ab = 0. But since F is a field all nonzero
elements have multiplicative inverses; in particular there exists c ∈ F such
that ca = 1. Thus

b = 1b = (ca)b = c(ab) = c0 = 0

contradicting the fact that b ≠ 0. So F can have no zero divisors.

□

Exercises

1. Prove parts (v), (vi) and (vii) of Theorem 2.10.

2. Prove that Ring Axioms (iv), (v) and (vi) hold in the direct sum R + S
of two rings R and S.

3. Prove that Ring Axioms (ii), (iii) and (iv) are satisfied in Mat(n, R).

(Hint: The zero element of Mat(n, R) is the matrix all of whose
entries are zero, and the negative of a matrix A is the matrix
whose entries are the negatives of the entries of A.)

4. Prove that Ring Axiom (vi) is satisfied in Mat(n, R).

5. Let A be any set and R any ring, and let ℱ(A, R) be the set of all
functions from A to R. Let addition and multiplication be defined on
ℱ(A, R) by the rules

(f + g)(a) = f(a) + g(a)

(fg)(a) = f(a)g(a)

for all f, g ∈ ℱ(A, R) and a ∈ A. Prove that with these operations
ℱ(A, R) is a ring.

6. Suppose that e_1 and e_2 are both identity elements in the ring R. Prove
that e_1 = e_2. (Hint: Consider e_1 e_2.)

7. Let R be a ring with an identity element 1. Prove that an element a ∈ R
can have at most one multiplicative inverse.

8. (i) Is the equation a² − b² = (a − b)(a + b) valid in all rings?

(ii) Let R be a commutative ring and let x and y be elements of R having
the property that x² = 0 and y² = 0. Prove that (x + y)³ = 0.

(iii) Give an example of a (noncommutative) ring R having elements x
and y such that x² = 0 and y² = 0 but (x + y)³ ≠ 0.

(Hint: The matrix x = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} satisfies x² = 0.)

9. Suppose that R is a commutative ring and a ∈ R is nonzero and not a
zero divisor. Prove that if b, c ∈ R satisfy ab = ac then b = c.

10. By imitating the construction in #5, describe how to construct the direct
sum A + B + C of three rings A, B and C.

11. A ring R is said to be Boolean if a² = a for all a ∈ R. Prove that if R
is Boolean then 2a = 0 for all a ∈ R. Prove also that all Boolean rings
are commutative.


3 

The integers 


In this chapter we will investigate divisibility and factorization in the ring ℤ.
These properties will be used in the next chapter in the construction of the
rings ℤ_n, our first example of the important concept of a quotient of a ring.
Our treatment of ℤ and its quotients will be mimicked later in our discussion
of polynomial rings and their quotients, which are of central importance in
field extension theory.


§3a Two basic properties of the integers 

We begin with a property of the set ℤ⁺ of positive integers which is equivalent
to the principle of mathematical induction. It should be regarded as an
axiom.

3.1 Least Integer Principle Every nonempty set of positive integers
has a least element.

Comment »>

3.1.1 To convince yourself that the principle of mathematical induction
follows from 3.1, it is worthwhile to try rewriting a simple proof by induction
as a proof by the Least Integer Principle. The idea is this. Suppose we wish
to prove that some statement P(n) is true for all positive integers n. We
check first that P(1) is true. Now let S be the set of all positive integers for
which P(n) is not true; we aim to prove that S is empty. If it is not then by
3.1 it has a least element, k, and k > 1 since 1 ∉ S. Thus k − 1 is positive
and not in S, so P(k − 1) is true. If we can prove that P(n) is true whenever
P(n − 1) is true it will follow that P(k) is true, contradicting k ∈ S, and
thereby showing that S = ∅. Thus although we have appealed to 3.1 rather
than induction, our task is the same: prove that P(1) is true and prove that
P(n) is true whenever P(n − 1) is true.

As an illustration of this, let us rewrite the well known inductive proof
that Σ_{i=1}^{n} i³ = (1/4)n²(n + 1)² as a proof which appeals to 3.1 instead of
induction. (We have no use for this formula in this course, but we will make
use of the Least Integer Principle to prove other properties of ℤ.)

Let S = { n ∈ ℤ⁺ | Σ_{i=1}^{n} i³ ≠ (1/4)n²(n + 1)² } (the set of all positive
integers for which the given formula is false). Our aim is to prove that S is
empty. Using indirect proof (see §0d), assume that S ≠ ∅. By 3.1 it follows
that S must have a least element. Let k be this least element. Since
Σ_{i=1}^{1} i³ = 1³ = (1/4)1²(1 + 1)² we see that 1 ∉ S, and so k ≠ 1. Since
k ∈ ℤ⁺ and k ≠ 1, it follows that k − 1 is also a positive integer. Since
k − 1 < k and k is the least positive integer in S, it follows that k − 1 ∉ S.
Thus Σ_{i=1}^{k−1} i³ = (1/4)(k − 1)²k². Adding k³ to both sides of this gives

Σ_{i=1}^{k} i³ = (1/4)(k − 1)²k² + k³

= (1/4)((k² − 2k + 1)k² + 4k³)

= (1/4)(k⁴ − 2k³ + k² + 4k³)

= (1/4)(k⁴ + 2k³ + k²)

= (1/4)k²(k + 1)²,

and by the definition of S we deduce that k ∉ S. But this is a contradiction,
since k ∈ S by the definition of k. This contradiction completes the proof
that S is empty, which is what we had to prove. >»
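The set S of 3.1.1 can also be explored numerically. The sketch below searches for the least element of S among the integers 1, …, limit; finding none is consistent with the proof but of course proves nothing about larger n. The function name `least_counterexample` and the search bound are assumptions made for the illustration.

```python
def least_counterexample(limit):
    """Least n in 1..limit with 1³ + ... + n³ != n²(n+1)²/4, or None if there is none."""
    total = 0
    for n in range(1, limit + 1):
        total += n ** 3
        # Compare 4 * total with n²(n+1)² to stay within integer arithmetic.
        if 4 * total != n * n * (n + 1) ** 2:
            return n
    return None

print(least_counterexample(1000))  # None
```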

The illustrative proof in 3.1.1 above is somewhat convoluted, and is 
more naturally expressed as a direct proof by induction, in the usual way. In 
other cases, however, the Least Integer Principle may be more natural and 
easier to use than induction. Indeed, our next proof is a case in point. We 
use 3.1 to prove the Division Theorem for integers: 

3.2 Theorem If a ∈ ℤ and n ∈ ℤ⁺ then there exist unique integers q and
r with a = qn + r and 0 ≤ r < n.

Proof. Since r has to be a − qn, the theorem can be rephrased as follows:

(*) If a ∈ ℤ and n ∈ ℤ⁺ then there exists a
unique integer q such that 0 ≤ a − qn < n.

Given a and n, we first prove the existence of such a q.

Let S = { m ∈ ℤ⁺ | m = kn − a for some k ∈ ℤ }. If we put k = a² + 1
then

kn − a = (a² + 1)n − a ≥ a² + 1 − a > (a − 1/2)² ≥ 0

and so kn − a ∈ S. Hence S ≠ ∅, and by 3.1 it follows that S has a least
element.

Let k_0 ∈ ℤ be chosen so that k_0 n − a is the least element of S. Then
we have

(1) k_0 n − a > 0.

Moreover, since (k_0 − 1)n − a < k_0 n − a it follows that (k_0 − 1)n − a ∉ S,
and therefore

(2) (k_0 − 1)n − a ≤ 0.

Combining (1) and (2) yields

(k_0 − 1)n ≤ a < k_0 n,

and, on subtracting (k_0 − 1)n throughout,

0 ≤ a − qn < n

where q = k_0 − 1. This establishes the existence part of (*).

To prove the uniqueness assertion we must show that if q, q' are integers
satisfying 0 ≤ a − qn < n and 0 ≤ a − q'n < n then q = q'.

Assume that q, q' are such integers. Then

(q − q')n = (a − q'n) − (a − qn) < n

since a − q'n < n and a − qn ≥ 0. Similarly,

(q' − q)n = (a − qn) − (a − q'n) < n.

So we obtain

−n < (q − q')n < n,

and, on dividing through by n,

−1 < q − q' < 1.

Since q and q' are integers it follows that q = q', as required. □

Comments »>

3.2.1 The integers q and r in 3.2 are called the quotient and remainder
when a is divided by n.

3.2.2 If the remainder r in 3.2 is zero, so that a = qn, we write ‘n|a’,
which should be read as ‘n is a factor of a’, or ‘n divides a’ (short for ‘n
divides a exactly’), or ‘n is a divisor of a’, or ‘a is a multiple of n’.

3.2.3 The integer 0 is divisible by all integers, since the equation 0 = 0n
is valid for all n.

3.2.4 Observe that if a is an integer then the set of all divisors of −a is
the same as the set of all divisors of a. >»
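The quotient and remainder of Theorem 3.2 can be computed with Python's built-in `divmod`, which uses floor division and therefore returns a remainder in the range 0 ≤ r < n whenever n is positive, even for negative a. The wrapper name `quotient_remainder` below is hypothetical, and the sketch is an illustration rather than part of the original text.

```python
def quotient_remainder(a, n):
    """Return (q, r) with a = q*n + r and 0 <= r < n, for an integer a and n > 0."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    q, r = divmod(a, n)
    # Sanity check of the two conditions in Theorem 3.2.
    assert a == q * n + r and 0 <= r < n
    return q, r

print(quotient_remainder(17, 5))    # (3, 2)
print(quotient_remainder(-17, 5))   # (-4, 3)
print(quotient_remainder(0, 5))     # (0, 0)
```

Note that for a = −17 the quotient is −4 rather than −3, because the remainder is required to be nonnegative.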


§3b The greatest common divisor of two integers 

3.3 Theorem If a and b are integers which are not both zero then there
is a unique positive integer d such that

(a) d|a and d|b,

(b) if c is an integer such that c|a and c|b then c|d.

Furthermore, there exist integers m and n such that ma + nb = d.

Comment »>

3.3.1 Part (a) says that d is a common divisor of a and b, while part (b)
says that any other common divisor of a and b is a divisor of d. So d is the
greatest common divisor of a and b:

d = gcd(a, b).

(A common notation is just d = (a, b). Some authors prefer the nomenclature
‘highest common factor’ to ‘greatest common divisor’.) >»

We give two proofs of the existence of a d with the properties described 
in Theorem 3.3. The first, which is based on the Euclidean Algorithm, also 
provides a method for calculating d. 

If a and b are integers which are not both zero, define D = D(a, b) to
be the set of all common divisors of a and b:

D(a, b) = { c ∈ ℤ | c|a and c|b }.

Our aim is to prove that there exists d ∈ D such that d is divisible by all
elements of D, and the strategy is to replace a and b by integers with smaller
absolute value without changing D.

Observe first that by 3.2.4 we may replace a and b by |a| and |b| without
changing the set of common divisors. So we may assume that a ≥ 0 and b ≥ 0.
The next observation (Lemma 3.4 below) is that the set of common divisors
is unchanged if a is replaced by another integer which differs from a by a
multiple of b. The Euclidean Algorithm uses this fact repeatedly to replace
a and b by smaller numbers until one of them becomes zero.

3.4 Lemma If a, b, m ∈ ℤ then D(a, b) = D(b, a + mb).

Proof. Suppose that c ∈ D(a, b). Then a = rc and b = sc for some integers
r and s. Hence a + mb = (r + ms)c, and so c is a divisor of a + mb. Since
c is also a divisor of b it follows that c ∈ D(a + mb, b). This shows that
D(a, b) ⊆ D(a + mb, b).

Suppose, on the other hand, that e ∈ D(a + mb, b). Then a + mb = te
and b = ue for some integers t and u, and this gives a = (t − mu)e. Hence e
is a divisor of a as well as of b, and so e ∈ D(a, b). So D(a + mb, b) ⊆ D(a, b).

□ 
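Lemma 3.4 can be spot-checked numerically. The sketch below lists only the positive common divisors (by 3.2.4 the negative ones are just their negatives) and compares D(a, b) with D(a + mb, b) for several values of m. The function name `common_divisors` and the sample values are assumptions made for the illustration.

```python
def common_divisors(a, b):
    """Set of positive common divisors of integers a and b (not both zero)."""
    bound = max(abs(a), abs(b))
    return {c for c in range(1, bound + 1) if a % c == 0 and b % c == 0}

a, b = 84, 133
print(sorted(common_divisors(a, b)))  # [1, 7]
print(all(common_divisors(a, b) == common_divisors(a + m * b, b)
          for m in range(-5, 6)))     # True
```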

To find the greatest common divisor of two nonnegative integers a and b,
proceed as follows. Without loss of generality we may assume that a ≥ b. If
b ≠ 0 let b' be the remainder on division of a by b, and define a' = b. Since
b' = a − qb for some q, Lemma 3.4 gives

D = D(a, b) = D(b, a − qb) = D(a', b').

So D is unchanged, and a' and b' are smaller than a and b were. If b' ≠ 0
we can replace a by a' and b by b' and repeat the process. Eventually, after
repeating the process often enough, the smaller of the two numbers will be
zero. Let d be the other number. Then the set D, which is unchanged
throughout, is equal to

D(d, 0) = { c ∈ ℤ | c|d and c|0 } = { c ∈ ℤ | c|d }.

So the set of all common divisors of the two numbers a and b that we started
with equals the set of all divisors of d. In particular, d itself is the greatest
common divisor of a and b, divisible by every other common divisor.

We can conveniently paraphrase the above description of the algorithm
using terminology based on various computer programming languages:

Euclidean Algorithm.

    while b ≠ 0 do
        [a, b] := [b, a − b * (a div b)]
    enddo
    return a

The sign ‘:=’ in this means “becomes”. The pair of numbers a and b
are replaced by b and a − qb respectively, where q is the quotient on dividing a
by b, and this is repeated until b = 0. The program then returns the other
number, which is the gcd of the initial two.
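The pseudocode above can be transcribed almost line for line into a real programming language. The following Python sketch is one such transcription, offered as an illustration; the function name `gcd` is chosen here, and the initial replacement of a and b by their absolute values follows the remark based on 3.2.4.

```python
def gcd(a, b):
    """Euclidean Algorithm: gcd of two integers which are not both zero."""
    a, b = abs(a), abs(b)
    while b != 0:
        # a // b is the quotient q, so a - b * q is the remainder on dividing a by b.
        a, b = b, a - b * (a // b)
    return a

print(gcd(84, 133))   # 7
print(gcd(133, 84))   # 7
print(gcd(0, 5))      # 5
```

If a < b at the start, the first pass through the loop simply swaps the two numbers, so no separate ordering step is needed.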

Using mathematical terminology of a more conventional kind, let a_1 = a
and a_2 = b, and (recursively) define a_{i+1} to be the remainder on dividing
a_{i−1} by a_i, as long as a_i ≠ 0. This results in the following formulae:

a_1 = q_3 a_2 + a_3,    0 < a_3 < a_2

a_2 = q_4 a_3 + a_4,    0 < a_4 < a_3

⋮

a_{k−2} = q_k a_{k−1} + a_k,    0 < a_k < a_{k−1}

a_{k−1} = q_{k+1} a_k,    a_{k+1} = 0.

Since (by 3.2) the remainder on dividing a_{i−1} by a_i is a nonnegative integer
less than a_i, the sequence a_2, a_3, ... is a strictly decreasing sequence of
nonnegative integers. Any such sequence must eventually reach 0. So the
process must terminate. Now if D = D(a_1, a_2) then repeated application of
3.4 gives

D = D(a_2, a_3) = D(a_3, a_4) = ⋯ = D(a_k, a_{k+1}).

But D(a_k, a_{k+1}) = D(a_k, 0) is just the set of all divisors of a_k, and so putting
d = a_k we conclude immediately that d ∈ D and c|d for all c ∈ D. That is,
(a) and (b) of 3.3 are satisfied. Summarizing:

The gcd of two integers is the last nonzero
remainder obtained in the Euclidean Algorithm.

The equations above also show how to express gcd(a_1, a_2) in the form
m a_1 + n a_2 with m, n ∈ ℤ. The idea is that a_1 and a_2 are trivially expressed
in this form, since we obtain

a_1 = m_1 a_1 + n_1 a_2

a_2 = m_2 a_1 + n_2 a_2

if we define m_1 = 1, n_1 = 0 and m_2 = 0, n_2 = 1, and the equations from
the algorithm permit one to successively express a_3, a_4, ..., and eventually
a_k = gcd(a_1, a_2), in the required form. Specifically, if we have expressions
for a_{i−1} and a_i

a_{i−1} = m_{i−1} a_1 + n_{i−1} a_2

a_i = m_i a_1 + n_i a_2,

then the equation

a_{i−1} = q_{i+1} a_i + a_{i+1}

(from the algorithm) gives

a_{i+1} = a_{i−1} − q_{i+1} a_i

= (m_{i−1} a_1 + n_{i−1} a_2) − q_{i+1}(m_i a_1 + n_i a_2)

= m_{i+1} a_1 + n_{i+1} a_2,

where

m_{i+1} = m_{i−1} − q_{i+1} m_i

n_{i+1} = n_{i−1} − q_{i+1} n_i.

Thus we have an expression for a_{i+1} and can repeat the process. This
eventually yields an expression of the required kind for a_k = gcd(a_1, a_2).


Thus the full algorithm, for finding the gcd of a_0 and b_0 and expressing
it in the form m a_0 + n b_0, is as follows:

At the end of the process, a is the gcd of a_0 and b_0, and m and n satisfy
a = m a_0 + n b_0.
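A possible rendering of such an algorithm in Python is sketched below. It is not a transcription of the book's own listing; it is reconstructed from the recurrences m_{i+1} = m_{i−1} − q_{i+1} m_i and n_{i+1} = n_{i−1} − q_{i+1} n_i, and the names `extended_gcd`, `m2` and `n2` are hypothetical. Nonnegative inputs are assumed.

```python
def extended_gcd(a0, b0):
    """Return (d, m, n) with d = gcd(a0, b0) = m*a0 + n*b0, for a0, b0 >= 0 not both zero."""
    a, b = a0, b0
    m, n = 1, 0      # invariant: a == m * a0 + n * b0
    m2, n2 = 0, 1    # invariant: b == m2 * a0 + n2 * b0
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        m, m2 = m2, m - q * m2
        n, n2 = n2, n - q * n2
    return a, m, n

d, m, n = extended_gcd(84, 133)
print(d, m, n)                # 7 8 -5
print(m * 84 + n * 133 == d)  # True
```

The two invariants noted in the comments correspond to the expressions for a_{i−1} and a_i in the derivation above, and both are preserved by each pass through the loop.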

We have now proved most of Theorem 3.3, but we have still to prove
the uniqueness of the gcd. For this, assume that d_1 and d_2 are both gcd's of
a and b; that is, (a) and (b) of 3.3 are satisfied with d replaced by d_1 and
also with d replaced by d_2. Now since d_1|a and d_1|b (by (a) for d_1) it follows
from (b) for d_2 that d_1|d_2. Hence d_1 ≤ d_2. But exactly the same reasoning,
using (a) for d_2 and (b) for d_1, gives d_2 ≤ d_1. So d_1 = d_2.

- Example - 

#1 Compute gcd(84, 133) and find integers m and n such that

gcd(84, 133) = 84m + 133n.

→ The steps in the Euclidean Algorithm are:

(1) 133 = 1 × 84 + 49

(2) 84 = 1 × 49 + 35

(3) 49 = 1 × 35 + 14

(4) 35 = 2 × 14 + 7

(5) 14 = 2 × 7

The last nonzero remainder is 7, and so 7 = gcd(84, 133).

Now (1) gives

(6) 49 = 133 − 1 × 84

and substituting this into (2) we obtain an expression for 35 as a combination
of 84 and 133:

(7) 35 = 84 − 1 × 49

= 84 − (133 − 1 × 84)

= 2 × 84 − 133.

Substituting (6) and (7) into (3) gives an expression for 14:

(8) 14 = 49 − 1 × 35

= (133 − 1 × 84) − (2 × 84 − 133)

= 2 × 133 − 3 × 84.

Substitute (7) and (8) into (4):

(9) 7 = 35 − 2 × 14

= (2 × 84 − 133) − 2(2 × 133 − 3 × 84)

= 8 × 84 − 5 × 133.

So m = 8, n = −5 is a solution.


Now for the second proof of the existence of the gcd: 

Proof. Let a and b be integers which are not both zero, and let

S = { ma + nb | m, n ∈ ℤ } ∩ ℤ⁺.

Since a² + b² > 0 we see that a² + b² ∈ S, and so, by 3.1, S has a least
element. Let d be this least element. Then certainly there exist m, n ∈ ℤ
with d = ma + nb.

Let d' be the remainder on dividing a by d. Then 0 ≤ d' < d, and, for
some q ∈ ℤ,

d' = a − qd

= a − q(ma + nb)

= (1 − qm)a + (−qn)b.

If d' ≠ 0 this shows that d' ∈ S, contradicting the fact that d is the least
element of S. Hence d' = 0; that is, d|a. The same reasoning with a replaced
by b shows that d|b. Finally, if c is an integer such that c|a and c|b then c
divides ma + nb = d, so that (b) of 3.3 holds as well. □
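The characterization of d used in this proof, as the least positive integer of the form ma + nb, can be observed by brute force for small numbers. The sketch below only searches coefficients with |m|, |n| ≤ bound, so it is an experiment rather than an algorithm; the function name `least_positive_combination` and the default bound are assumptions.

```python
def least_positive_combination(a, b, bound=20):
    """Least positive value of m*a + n*b over all |m|, |n| <= bound."""
    values = [m * a + n * b
              for m in range(-bound, bound + 1)
              for n in range(-bound, bound + 1)]
    return min(v for v in values if v > 0)

print(least_positive_combination(84, 133))  # 7
print(least_positive_combination(12, 18))   # 6
```

Every value ma + nb is a multiple of gcd(a, b), so the search can never return anything smaller than the gcd; it returns the gcd itself as soon as suitable coefficients lie within the bound.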


§3c Factorization into primes 

3.5 Definition A nonnegative integer is said to be prime if it is strictly 
greater than 1 and has no positive factors other than itself and 1. 

3.6 Proposition If a|bc and gcd(a, b) = 1 then a|c.

Proof. Since gcd(a, b) = 1 there exist (by Theorem 3.3) integers m and n
such that ma + nb = 1. Multiplying through by c gives c(ma) + c(nb) = c,
which can be rewritten as (cm)a + (bc)n = c since multiplication in ℤ is
commutative. Now both terms on the left hand side are multiples of a (since
we are given that a|bc). Hence a|c. □

3.7 Theorem Each integer n > 1 can be expressed as a product of primes.
(That is, n = p_1 p_2 ⋯ p_r for some r ≥ 1 and primes p_1, p_2, ..., p_r ∈ ℤ⁺.)

Proof. Assume that the above statement is false. By the Least Integer
Principle there exists a least integer n > 1 which is not expressible as a
product of primes. Then n itself is not prime (otherwise we could take r = 1
and p_1 = n); so n = n_1 n_2 with 1 < n_1 < n and 1 < n_2 < n. By the
minimality of n it follows that n_1 and n_2 are both expressible as products of
primes. Hence so is n, a contradiction. □
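The proof of Theorem 3.7 is an existence argument and does not say how to find the primes. One simple way, which is not the method of the proof, is trial division: repeatedly split off the least divisor greater than 1, which is necessarily prime. The sketch below implements this; the function name `prime_factors` is hypothetical.

```python
def prime_factors(n):
    """Prime factors of an integer n > 1, in nondecreasing order, by trial division."""
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            # d is the least divisor > 1 of the current n, hence prime.
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains has no divisor d with d*d <= n, so it is prime.
        factors.append(n)
    return factors

print(prime_factors(4200))  # [2, 2, 2, 3, 5, 5, 7]
print(prime_factors(97))    # [97]
```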

3.8 Theorem The product in 3.7 is unique up to the order of the factors.
In other words, if p_1, p_2, ..., p_r and q_1, q_2, ..., q_s are positive prime integers
such that p_1 p_2 ⋯ p_r = q_1 q_2 ⋯ q_s then r = s, and, for some permutation σ of
{1, 2, ..., r}, we have p_i = q_{σ(i)} for i = 1, 2, ..., r.

Proof. We have p_1 | p_1 p_2 ⋯ p_r = q_1(q_2 q_3 ⋯ q_s). Observe that since the only
positive divisors of a prime are itself and 1, two distinct primes can have no
positive common divisors other than 1. So if p_1 ≠ q_1 then gcd(p_1, q_1) = 1,
and by 3.6 it follows that p_1 | q_2 q_3 ⋯ q_s. If p_1 ≠ q_2 the same argument gives
p_1 | q_3 ⋯ q_s. Thus if p_1 ≠ q_j for each j we get successively p_1 | q_2 q_3 ⋯ q_s,
p_1 | q_3 q_4 ⋯ q_s, p_1 | q_4 ⋯ q_s, ..., and eventually p_1 | q_s. But this is contrary
to the assumption that p_1 ≠ q_j for each j. So p_1 = q_{j_1} for some j_1. Now
cancelling gives p_2 p_3 ⋯ p_r = q_1 ⋯ q_{j_1 − 1} q_{j_1 + 1} ⋯ q_s, and repeating the
argument gives p_2 = q_{j_2} for some j_2 ≠ j_1. We can continue in this way cancelling
factors until one side or the other is reduced to 1. But since 1 cannot equal
a product of primes greater than 1 it follows that when all the factors have
been cancelled from one side all the factors have been cancelled from the
other side too. So we must have r = s and p_i = q_{j_i} for i = 1, 2, ..., r,
where j_1, j_2, ..., j_r are all distinct; that is, p_i = q_{σ(i)} where σ(i) = j_i is a
permutation of {1, 2, ..., r}. □


- Example - 

#2 Prove that √3 is irrational.

→ We prove first that if a and b are integers such that a² − 3b² = 0 then
a = b = 0.

Suppose to the contrary that there exists a nontrivial integral solution
to a² − 3b² = 0. Replacing a and b by their absolute values we may assume
that a and b are both in ℤ⁺. By Theorem 3.7 both a and b can be expressed
as products of primes; say

a = p_1 p_2 ⋯ p_n and b = q_1 q_2 ⋯ q_m.

Now a² = 3b² gives

(**) p_1² p_2² ⋯ p_n² = 3 q_1² q_2² ⋯ q_m².

But by Theorem 3.8 prime factorizations are unique up to ordering of the
factors; hence the number of times 3 occurs on the left hand side of (**)
equals the number of times it occurs on the right hand side. But 3 occurs an
even number of times on the left hand side (twice the number of i such that
p_i = 3) and an odd number of times on the right hand side (one plus twice
the number of j such that q_j = 3). This contradiction shows that no such a
and b exist.

We can now see that √3 is irrational, for if √3 = a/b with a, b ∈ ℤ
then b ≠ 0 and a² − 3b² = 0. But the lemma just proved shows that this is
impossible.

Of course, similar proofs apply for \/ 2 , \/5. y/ 6 , \pi, and so on. 


Exercises 

1. In each case compute the gcd of the given integers $a$ and $b$ and find
integers $m$ and $n$ such that $\gcd(a, b) = ma + nb$:

(i) $a = 420$, $b = 2079$ (ii) $a = 1188$, $b = 4200$.

2. Prove that is irrational. 

3. Show that if $a \mid b$ and $b \mid c$ then $a \mid c$.

4. Show that if $a \mid r$ and $b \mid s$ then $\gcd(a, b) \mid \gcd(r, s)$.

5. Let $a$, $b$ and $c$ be integers, and let $d = \gcd(a, \gcd(b, c))$. Show that
$d = \gcd(\gcd(a, b), c)$. Show also that $d$ is the largest positive integer
which is a divisor of all of $a$, $b$ and $c$, and that there exist integers $l$, $m$
and $n$ with $d = la + mb + nc$.

6. Let $a$, $b$ be positive integers and $M$ the set of positive integers which
are multiples of both $a$ and $b$. The least element of $M$ is called the least
common multiple of $a$ and $b$ and is denoted by $\mathrm{lcm}(a, b)$.

(i) Show that if $a \mid c$ and $b \mid c$ then $ab \mid cd$, where $d = \gcd(a, b)$.

(ii) Show that $\mathrm{lcm}(a, b) = ab/\gcd(a, b)$.

7. Let $a$, $b$ and $e$ be integers and let $d = \gcd(a, b)$. Prove that there exist
integers $m$ and $n$ such that $ma + nb = e$ if and only if $e$ is a multiple
of $d$.

8. (i) Find an integral solution $(m, n)$ to the equation

$$4641m + 2093n = 364.$$

(ii) Prove that there is no integral solution to the equation

$$91m + 63n = 6.$$

9. Let $a$ and $b$ be integers, and let $m = m_0$, $n = n_0$ be an integral solution
to the equation

$$(\$)\qquad ma + nb = d$$

where $d = \gcd(a, b)$. Prove that for any $k \in \mathbb{Z}$

$$m = m_0 + k(b/d), \qquad n = n_0 - k(a/d)$$

is another solution to (\$). Prove also that every integral solution to (\$)
has this form.

10. Let $m$ be a positive integer. Show that there exist unique integers
$a_0, a_1, \dots, a_r$ such that $a_r \ne 0$,

$$m = a_0 + 8a_1 + 8^2 a_2 + \cdots + 8^r a_r$$

and $0 \le a_i < 8$ for $i = 0, 1, \dots, r$.

(Hint: Use the Division Theorem repeatedly.)




Quotients of the ring of integers 


Our primary objective in this chapter is the construction of the rings $\mathbb{Z}_n$.
We start with a preliminary section which will also be needed later in our
discussion of quotient rings in general.


§4a Equivalence relations 

Sometimes when considering elements of some set S it is convenient to lump 
together various elements of S if they are equivalent to one another, by some 
criterion of equivalence. For example, if S is the set of all cars in Sydney we 
may wish to regard two elements of S as equivalent if they are of the same 
make. (That is, all Holdens are equivalent, all Fords are equivalent, and so 
on.) Obviously the set of all equivalence classes will be much smaller than 
the set S itself. One can easily invent various other equivalence relations, 
but to justify the term ‘equivalence’ the following properties should hold: 

(i) Every element should be equivalent to itself. 

(ii) If a is equivalent to b then b should be equivalent to a. 

(iii) If a is equivalent to b and b is equivalent to c then a should be equivalent 
to c. 

4.1 Definition Let $\sim$ be a relation on a set $S$. That is, for every pair $a$,
$b$ of elements of $S$ either $a \sim b$ ($a$ is related to $b$) or $a \not\sim b$
($a$ is not related to $b$). Then $\sim$ is called an equivalence relation if the
following hold for all $a, b, c \in S$:

(i) $a \sim a$. (reflexive law)

(ii) If $a \sim b$ then $b \sim a$. (symmetric law)

(iii) If $a \sim b$ and $b \sim c$ then $a \sim c$. (transitive law)

If $\sim$ is an equivalence relation on the set $S$ and $a \in S$ define

$$\overline{a} = \{\, b \in S \mid b \sim a \,\}.$$

That is, $\overline{a}$ is the subset of $S$ consisting of all elements equivalent
to $a$. The subset $\overline{a}$ is called the equivalence class of $a$.


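4.2 Theorem If $\sim$ is an equivalence relation on $S$ and $a, b \in S$ then

(i) $\overline{a} = \overline{b}$ if and only if $a \sim b$,

(ii) if $a \not\sim b$ then $\overline{a} \cap \overline{b} = \emptyset$.


Proof. (i) Suppose that $a \sim b$. By the Symmetric Law we also have that
$b \sim a$. Now for $x \in S$ the Transitive Law gives us the following facts:

(*) If $x \sim a$ then $x \sim a$ and $a \sim b$; so $x \sim b$.
    If $x \sim b$ then $x \sim b$ and $b \sim a$; so $x \sim a$.

By (*) we see that $x \sim a$ if and only if $x \sim b$. Hence

$$\overline{a} = \{\, x \in S \mid x \sim a \,\} = \{\, x \in S \mid x \sim b \,\} = \overline{b}.$$

Conversely, suppose that $\overline{a} = \overline{b}$. Since $a \sim a$ we have

$$a \in \{\, x \in S \mid x \sim a \,\} = \overline{a}.$$

Hence

$$a \in \overline{b} = \{\, x \in S \mid x \sim b \,\}.$$

So $a \sim b$.

(ii) Suppose that $a \not\sim b$ and $\overline{a} \cap \overline{b} \ne \emptyset$. Let $c \in \overline{a} \cap \overline{b}$. Then we have
$c \in \overline{a}$ and $c \in \overline{b}$; so $c \sim a$ and $c \sim b$. By the Symmetric Law we deduce that
$a \sim c$ and $c \sim b$, so that the Transitive Law gives $a \sim b$, contradicting our initial
assumption. □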

Comment ▷▷▷

4.2.1 Theorem 4.2 shows us that the equivalence classes form a partition
of the set $S$; that is, every element of $S$ lies in exactly one equivalence
class. ▷▷▷


We now define

$$\overline{S} = \{\, \overline{a} \mid a \in S \,\}.$$

That is, $\overline{S}$ is the set of all equivalence classes of elements of $S$.

By 4.2, $\overline{a}$ and $\overline{b}$ are the same if $a \sim b$. So when we deal with equivalence
classes we are, so to speak, amalgamating equivalent elements. Intuitively,
we pretend that we cannot tell the difference between equivalent elements.
Then all equivalent elements combine to be one single element (which, strictly
speaking, is an equivalence class) of a set $\overline{S}$ which is smaller than our original
set $S$.

4.3 Definition The set $\overline{S}$ defined above is called the quotient of $S$ by the
equivalence relation $\sim$.
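
For a finite set the passage from $S$ to its quotient can be carried out mechanically. The sketch below is illustrative only: the names `quotient` and `same_parity` are hypothetical, and the function assumes, without checking, that the relation it is given really is an equivalence relation.

```python
def quotient(S, related):
    """Partition the finite collection S into equivalence classes.

    `related(a, b)` is assumed to be an equivalence relation on S;
    this function does not verify the three laws.
    """
    classes = []
    for a in S:
        for cls in classes:
            if related(a, cls[0]):   # by Theorem 4.2 one representative suffices
                cls.append(a)
                break
        else:
            classes.append([a])      # a is not equivalent to anything seen so far
    return classes

same_parity = lambda a, b: (a - b) % 2 == 0
print(quotient(range(6), same_parity))   # [[0, 2, 4], [1, 3, 5]]
```

Each element lands in exactly one class, which is the partition property noted in 4.2.1.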


§4b Congruence relations on the integers 

Throughout this section let n be a fixed positive integer. 

4.4 Definition Let $\equiv$ be the relation defined on $\mathbb{Z}$ by the rule

$a \equiv b$ if and only if $a - b$ is a multiple of $n$.

The relation $\equiv$ is called congruence modulo $n$.

Comment ▷▷▷

4.4.1 We usually write ‘$a \equiv b \pmod{n}$’ rather than just ‘$a \equiv b$’, unless
there is no possible confusion. ▷▷▷

For example, $11 \equiv 3 \pmod{8}$, and $-6 \equiv 40 \pmod{23}$, and so on.

4.5 Theorem (i) Congruence modulo $n$ is an equivalence relation on $\mathbb{Z}$.
(ii) For every integer $m$ there is exactly one integer $r$ in the range $0 \le r < n$
such that $m$ is congruent to $r$ modulo $n$.

Proof. (i) Let $a, b, c \in \mathbb{Z}$ be such that $a \equiv b \pmod{n}$ and $b \equiv c \pmod{n}$.
Then, by the definition, $n \mid (b - a)$ and $n \mid (c - b)$. That is,

$$b - a = nr \quad\text{and}\quad c - b = ns$$

for some $r, s \in \mathbb{Z}$. Now

$$c - a = (c - b) + (b - a) = ns + nr = n(s + r).$$

So $n \mid (c - a)$, and we have proved that congruence modulo $n$ is a transitive
relation.

The proofs that congruence is reflexive and symmetric are left to the
exercises.

(ii) By definition, $m \equiv r \pmod{n}$ if and only if $n \mid (m - r)$. That is, $m \equiv r \pmod{n}$
if and only if $m - r = qn$ for some $q \in \mathbb{Z}$. But by Theorem 3.2 the
equation $m = qn + r$ has exactly one integral solution with $0 \le r < n$. □
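
Definition 4.4 and part (ii) of Theorem 4.5 translate directly into code. This is a small sketch with hypothetical function names; it relies on the fact that for a positive modulus Python's `%` operator returns a value in the range 0 to n - 1, even for negative arguments.

```python
def congruent(a, b, n):
    """True when a - b is a multiple of the positive integer n."""
    return (a - b) % n == 0

def least_residue(m, n):
    """The unique r with 0 <= r < n and m congruent to r modulo n."""
    return m % n   # for n > 0 the result always lies in range(n)

print(congruent(11, 3, 8), congruent(-6, 40, 23))   # True True
print(least_residue(-6, 23))                        # 17
```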
Comments ▷▷▷

4.5.1 By Theorem 4.5 the equivalence classes $\overline{0}, \overline{1}, \overline{2}, \dots, \overline{n-1}$ are all
the equivalence classes for the relation congruence modulo $n$, and these classes
are all distinct from one another.

4.5.2 We will use the notation ‘$\mathbb{Z}_n$’ rather than ‘$\overline{\mathbb{Z}}$’ for the set of all
equivalence classes. The equivalence classes for the congruence relation are
usually called ‘congruence classes’. ▷▷▷


§4c The ring of integers modulo n 

Let $n$ be a fixed positive integer. In the last section we defined the equivalence
relation ‘congruence modulo $n$’ on $\mathbb{Z}$ and defined $\mathbb{Z}_n$ to be the set of all
congruence classes. By Theorem 4.5

$$\mathbb{Z}_n = \{\overline{0}, \overline{1}, \overline{2}, \dots, \overline{n-1}\}$$

(where by definition $\overline{r} = \{\, m \in \mathbb{Z} \mid m \equiv r \pmod{n} \,\}$). Intuitively, we have
amalgamated into one object all integers which leave the same remainder on
division by $n$. For example, if $n = 5$ we have

$$\mathbb{Z}_5 = \{\overline{0}, \overline{1}, \overline{2}, \overline{3}, \overline{4}\}$$

where

$$\begin{aligned}
\overline{0} &= \{\dots, -5, 0, 5, 10, \dots\} \\
\overline{1} &= \{\dots, -4, 1, 6, 11, \dots\} \\
\overline{2} &= \{\dots, -3, 2, 7, 12, \dots\} \\
\overline{3} &= \{\dots, -2, 3, 8, 13, \dots\} \\
\overline{4} &= \{\dots, -1, 4, 9, 14, \dots\}.
\end{aligned}$$

Intuitively, the integers ..., $-6$, $-1$, 4, 9, 14, 19, ... become a single object,
as do ..., $-2$, 3, 8, ... and so on. That is, these numbers are all regarded
as equal when working modulo 5. What this means strictly is that

$$\cdots = \overline{-6} = \overline{-1} = \overline{4} = \overline{9} = \cdots$$

and so on. In other words we have many different names for the same
congruence class.

Observe that in the above example the sum of any element in the set $\overline{3}$ and
any element in the set $\overline{4}$ gives an element in the set $\overline{7} = \overline{2}$. Thus, for instance,
$-2 \in \overline{3}$, $19 \in \overline{4}$, and $-2 + 19 = 17 \in \overline{2}$. So it seems reasonable to define the
sum of the sets $\overline{3}$ and $\overline{4}$ to be equal to the set $\overline{2}$. That is,

$$\overline{3} + \overline{4} = \overline{2}.$$

Similarly the product of any element in the set $\overline{2}$ and any element in the set
$\overline{3}$ gives an element in the set $\overline{1}$. For example

$$2 \times 3 = 6 \in \overline{1}, \qquad 7 \times (-2) = -14 \in \overline{1}, \qquad (-8) \times 13 = -104 \in \overline{1}.$$

So we define

$$\overline{2} \times \overline{3} = \overline{1}.$$


This suggests the following general rule for addition and multiplication of 
congruence classes. To add (or multiply) two classes pick one element from 
each class and add (or multiply) the elements. The congruence class in which 
the answer lies is then defined to be the sum (or product) of the two given 
classes. We must show, however, that you get the same answer whichever 
elements you choose. Fortunately that is not hard to prove. 


4.6 Lemma Let $n$ be a positive integer. If $a$, $a'$, $b$, $b'$ are integers such that
$a \equiv a' \pmod{n}$ and $b \equiv b' \pmod{n}$ then

$$a + b \equiv a' + b' \pmod{n} \quad\text{and}\quad ab \equiv a'b' \pmod{n}.$$


Proof. Assume that $a \equiv a'$ and $b \equiv b'$. Then $n \mid a - a'$ and $n \mid b - b'$. That is,
$a' = a + rn$ and $b' = b + sn$ for some $r, s \in \mathbb{Z}$.

This gives

$$a' + b' = (a + rn) + (b + sn) = (a + b) + (r + s)n \equiv a + b \pmod{n}$$

and similarly

$$a'b' = ab + (rb + as + rsn)n \equiv ab \pmod{n}.$$

□


Comment ▷▷▷

4.6.1 In view of 4.2, an alternative formulation is this:

If $\overline{a} = \overline{a'}$ and $\overline{b} = \overline{b'}$ then $\overline{a + b} = \overline{a' + b'}$ and $\overline{ab} = \overline{a'b'}$. ▷▷▷


4.7 Theorem Addition and multiplication can be defined on $\mathbb{Z}_n$ in such a
way that $\overline{a} + \overline{b} = \overline{a + b}$ and $\overline{a}\,\overline{b} = \overline{ab}$ for all $a, b \in \mathbb{Z}$.

Proof. Given $\alpha$ and $\beta$ in $\mathbb{Z}_n$ there are uniquely determined integers $r$ and
$s$ such that $0 \le r < n$, $0 \le s < n$ and $\overline{r} = \alpha$, $\overline{s} = \beta$. Define $\alpha + \beta = \overline{r + s}$
and $\alpha\beta = \overline{rs}$. We have now defined addition and multiplication on $\mathbb{Z}_n$: it
remains to check that the formulae in the theorem statement are satisfied.

Let $a, b \in \mathbb{Z}$, and let $r, s \in \mathbb{Z}$ be such that $0 \le r < n$, $0 \le s < n$ and
$\overline{r} = \overline{a}$, $\overline{s} = \overline{b}$. By the definitions just given, $\overline{a} + \overline{b} = \overline{r + s}$ and $\overline{a}\,\overline{b} = \overline{rs}$, and
by 4.6.1 the required result follows. □

It is now trivial to verify that these operations of addition and
multiplication on $\mathbb{Z}_n$ satisfy the ring axioms.

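4.8 Theorem The set $\mathbb{Z}_n$ forms a ring with respect to the operations of
addition and multiplication as defined in 4.7.

Proof. It is necessary to check all of the axioms (i)-(vi) of Definition 2.2.
In each case one simply appeals to the same property of $\mathbb{Z}$ and the formulae
in 4.7. We will do only the first three axioms, leaving the others as exercises.

(i) Let $\alpha, \beta, \gamma \in \mathbb{Z}_n$. Then there exist $a, b, c \in \mathbb{Z}$ with $\alpha = \overline{a}$, $\beta = \overline{b}$, $\gamma = \overline{c}$,
and we have

$$\begin{aligned}
(\alpha + \beta) + \gamma &= (\overline{a} + \overline{b}) + \overline{c} \\
&= \overline{a + b} + \overline{c} && \text{(by the definition of addition in } \mathbb{Z}_n\text{)} \\
&= \overline{(a + b) + c} && \text{(similarly)} \\
&= \overline{a + (b + c)} && \text{(since addition in } \mathbb{Z} \text{ is associative)} \\
&= \overline{a} + \overline{b + c} && \text{(by the definition of addition in } \mathbb{Z}_n\text{)} \\
&= \overline{a} + (\overline{b} + \overline{c}) && \text{(similarly)} \\
&= \alpha + (\beta + \gamma).
\end{aligned}$$

(ii) We will prove that $\overline{0}$ is a zero element in $\mathbb{Z}_n$. Let $\alpha$ be any element of
$\mathbb{Z}_n$, and choose any $a$ in $\mathbb{Z}$ with $\alpha = \overline{a}$. Then

$$\alpha + \overline{0} = \overline{a} + \overline{0} = \overline{a + 0} = \overline{a} = \alpha.$$

Similarly, $\overline{0} + \alpha = \alpha$.

(iii) Let $\alpha \in \mathbb{Z}_n$ and let $a \in \mathbb{Z}$ with $\alpha = \overline{a}$. Then

$$\alpha + \overline{-a} = \overline{a} + \overline{-a} = \overline{a + (-a)} = \overline{0}$$

and similarly $\overline{-a} + \alpha = \overline{0}$. Since $\overline{0}$ is the zero element of $\mathbb{Z}_n$ this shows that
$\overline{-a}$ is a negative of $\alpha$. □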
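
For any particular small $n$ the ring axioms can also be confirmed by exhaustive search. The following sketch is only a sanity check for one value of $n$, not a substitute for the proof; it represents each congruence class by its least residue, as in the proof of 4.7, and the function name is hypothetical.

```python
from itertools import product

def check_ring_axioms(n):
    """Brute-force check of the ring axioms for residues 0..n-1 under + and * mod n."""
    Z = range(n)
    add = lambda a, b: (a + b) % n
    mul = lambda a, b: (a * b) % n
    for a, b, c in product(Z, repeat=3):
        assert add(add(a, b), c) == add(a, add(b, c))          # + is associative
        assert mul(mul(a, b), c) == mul(a, mul(b, c))          # * is associative
        assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))  # left distributive law
        assert mul(add(a, b), c) == add(mul(a, c), mul(b, c))  # right distributive law
    for a, b in product(Z, repeat=2):
        assert add(a, b) == add(b, a)                          # + is commutative
    for a in Z:
        assert add(a, 0) == a == add(0, a)                     # zero element
        assert add(a, (-a) % n) == 0                           # negatives
    return True

print(check_ring_axioms(6))   # True
```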
§4d Properties of the ring of integers modulo n

Since multiplication in $\mathbb{Z}$ is commutative we have in $\mathbb{Z}_n$ that

$$\overline{a_1}\,\overline{a_2} = \overline{a_1 a_2} = \overline{a_2 a_1} = \overline{a_2}\,\overline{a_1}$$

for all $a_1, a_2 \in \mathbb{Z}$. Since every element of $\mathbb{Z}_n$ is of the form $\overline{a}$ for some $a \in \mathbb{Z}$
this shows that multiplication in $\mathbb{Z}_n$ is commutative. We also have

$$\overline{1}\,\overline{a} = \overline{1a} = \overline{a} = \overline{a1} = \overline{a}\,\overline{1}$$

for all $a \in \mathbb{Z}$, and it follows that $\overline{1}$ is a multiplicative identity for $\mathbb{Z}_n$. Thus
we have proved

4.9 Theorem The ring $\mathbb{Z}_n$ is commutative and has an identity element.

The ring $\mathbb{Z}$ itself, as well as being commutative and having an identity
element, has the property that there are no zero divisors. Hence $\mathbb{Z}$ is an
integral domain. However the ring $\mathbb{Z}_n$ usually has zero divisors. For
instance in $\mathbb{Z}_6$ we have $\overline{3} \ne \overline{0}$ and $\overline{2} \ne \overline{0}$ but $\overline{3} \times \overline{2} = \overline{6} = \overline{0}$. Hence we see
that $\mathbb{Z}_n$ is not generally an integral domain. A little thought shows that it
is not possible to find zero divisors in $\mathbb{Z}_n$ in this way if $n$ is prime. In fact:

4.10 Theorem The ring $\mathbb{Z}_n$ is an integral domain if and only if $n$ is prime.

Proof. If $n$ is not prime then there exist $r, s \in \mathbb{Z}$ with $1 < r < n$, $1 < s < n$
and $rs = n$. This gives

$$\overline{r}\,\overline{s} = \overline{rs} = \overline{n} = \overline{0}$$

although $\overline{r} \ne \overline{0}$ and $\overline{s} \ne \overline{0}$. Since $\overline{0}$ is the zero element of $\mathbb{Z}_n$ it follows that
$\mathbb{Z}_n$ has zero divisors and is therefore not an integral domain.

On the other hand, suppose that $n$ is prime and suppose that $\overline{r}\,\overline{s} = \overline{0}$.
Then $\overline{rs} = \overline{0}$; that is, $rs \equiv 0 \pmod{n}$. So $n$ is a divisor of $rs$. But if
$r$ and $s$ are expressed as products of primes and these two expressions are
multiplied together, the result is an expression for $rs$ as a product of primes.
By Theorem 3.8 we deduce that the prime divisors of $rs$ are precisely the
prime divisors of $r$ together with the prime divisors of $s$. Since $n$ is a prime
it follows that $n$ must be either a prime divisor of $r$ or a prime divisor of
$s$. Hence either $r \equiv 0 \pmod{n}$ or $s \equiv 0 \pmod{n}$; that is, either $\overline{r} = \overline{0}$ or
$\overline{s} = \overline{0}$. Thus we have proved that it is impossible for the product of two
nonzero elements of $\mathbb{Z}_n$ to be zero. So $\mathbb{Z}_n$ has no zero divisors. Since also $\mathbb{Z}_n$
is commutative and has an identity element (Theorem 4.9), $\mathbb{Z}_n$ is an integral
domain. □
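
Theorem 4.10 is easy to test numerically for small moduli. The sketch below, with hypothetical helper names, lists the zero divisors of $\mathbb{Z}_n$ by brute force and compares the outcome with a naive primality test for $2 \le n < 50$.

```python
def zero_divisors(n):
    """Nonzero residues r for which r*s is 0 mod n for some nonzero residue s."""
    return [r for r in range(1, n)
            if any((r * s) % n == 0 for s in range(1, n))]

def is_prime(n):
    """Naive primality test by trial division, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

print(zero_divisors(6))   # [2, 3, 4]
print(zero_divisors(7))   # []

# No zero divisors exactly when n is prime, for every n in the tested range.
for n in range(2, 50):
    assert (zero_divisors(n) == []) == is_prime(n)
```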
So if $p$ is prime $\mathbb{Z}_p$ is an integral domain. However even more is true: in
fact $\mathbb{Z}_p$ is a field in this case. To prove this one must show that each nonzero
element of $\mathbb{Z}_p$ has a multiplicative inverse. (Observe, for instance, that this is true
in $\mathbb{Z}_5$. Since $\overline{2} \times \overline{3} = \overline{6} = \overline{1}$ it can be seen that $\overline{2}$ and $\overline{3}$ are multiplicative
inverses of each other, while similar calculations show that $\overline{1}$ and $\overline{4}$ are each
their own inverse.)

We prove a slightly more general statement, namely:

4.11 Theorem Suppose that $D$ is an integral domain which has only a
finite number of elements. Then $D$ is a field.

Proof. Recall that an integral domain is a commutative ring with identity
with no zero divisors, while a field is a commutative ring with identity for
which all nonzero elements have multiplicative inverses. So it suffices to prove
that if $a \in D$ and $a \ne 0$ then there exists $b \in D$ with $ab = 1$.

Let $a \in D$ with $a \ne 0$. Define a mapping $\lambda\colon D \to D$ by
$\lambda(b) = ab$ for all $b \in D$.

We will show that $\lambda$ is injective. That is, we show that $\lambda(b)$ and $\lambda(c)$ can
only be equal if $b = c$.

Suppose that $\lambda(b) = \lambda(c)$. Then $ab = ac$, and so $ab - ac = 0$. By the
distributive law we obtain $a(b - c) = 0$. But $D$ has no zero divisors (since
it is an integral domain), and since $a \ne 0$ it follows that $b - c = 0$; that is,
$b = c$. Hence $\lambda$ is injective, as claimed above.

Now suppose that $b_1, b_2, \dots, b_k$ are all the distinct elements of $D$.
Then $\lambda(b_1), \lambda(b_2), \dots, \lambda(b_k)$ are all distinct (since $\lambda(b_i) = \lambda(b_j)$ would imply
that $b_i = b_j$). But $D$ has only $k$ elements altogether; so each element of $D$
is equal to some $\lambda(b_i)$. In particular the identity element 1 equals $\lambda(b_i)$ for
some $i$. That is, $1 = \lambda(b) = ab$ for some $b \in D$. So $a$ has a multiplicative
inverse, namely $b$. This argument applies to any nonzero element $a$ of $D$, and
so it follows that $D$ is a field. □
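
The counting argument in this proof can be followed step by step in $\mathbb{Z}_p$. The sketch below, with a hypothetical function name, lists the values of the map $b \mapsto ab$ on least residues, checks that they are all distinct, and reads off the element sent to 1. It assumes that $p$ is prime and $a$ is nonzero; otherwise the internal check may fail.

```python
def inverse_by_counting(a, p):
    """Find the inverse of a nonzero residue a in Z_p by listing the map b -> a*b."""
    images = [(a * b) % p for b in range(p)]   # values of the map on all of Z_p
    assert len(set(images)) == p               # injective, hence every residue is hit
    return images.index(1)                     # the b with a*b = 1

print([inverse_by_counting(a, 5) for a in range(1, 5)])   # [1, 3, 2, 4]
```

The output agrees with the observation above that in $\mathbb{Z}_5$ the classes of 2 and 3 are inverse to each other while those of 1 and 4 are self-inverse.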

Note also that even if $n$ is not prime an element $\overline{a}$ of $\mathbb{Z}_n$ will have a
multiplicative inverse if $\gcd(a, n) = 1$. This is so since by Theorem 3.3 there
will exist integers $r$ and $s$ with $ra + sn = 1$, giving

$$ra = 1 - sn \equiv 1 \pmod{n}$$

so that $\overline{r}\,\overline{a} = \overline{1}$, and $\overline{r}$ is the inverse of $\overline{a}$.
- Example -

#1 Calculate the inverse of the element $\overline{24}$ in $\mathbb{Z}_{1001}$.

--▷ Apply the Euclidean Algorithm to find integers $r$ and $s$ which satisfy
$24r + 1001s = 1$:

$$\begin{aligned}
1001 &= 41 \times 24 + 17 \\
24 &= 1 \times 17 + 7 \\
17 &= 2 \times 7 + 3 \\
7 &= 2 \times 3 + 1
\end{aligned}$$

Substituting back gives

$$\begin{aligned}
1 &= 7 - 2 \times 3 = 7 - 2(17 - 2 \times 7) = 5(24 - 17) - 2 \times 17 \\
&= 5 \times 24 - 7(1001 - 41 \times 24) = 292 \times 24 - 7 \times 1001.
\end{aligned}$$

Thus $\overline{292}$ is the required inverse. ◁--
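
The same calculation can be automated. What follows is a sketch of the standard iterative extended Euclidean algorithm, which tracks the coefficients during the forward pass instead of substituting back afterwards; the function names are chosen here for illustration.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and x*a + y*b == g (a, b positive)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r   # remainders of the Euclidean Algorithm
        old_x, x = x, old_x - q * x   # coefficient of a
        old_y, y = y, old_y - q * y   # coefficient of b
    return old_r, old_x, old_y

def inverse_mod(a, n):
    """Least nonnegative residue inverse to a modulo n; requires gcd(a, n) = 1."""
    g, x, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a has no inverse modulo n")
    return x % n

print(extended_gcd(24, 1001))   # (1, 292, -7)
print(inverse_mod(24, 1001))    # 292
```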


Exercises 

1. Prove that congruence modulo $n$ is reflexive and symmetric.

2. Prove that $\mathbb{Z}_n$ satisfies Axioms (iv), (v) and (vi) of Definition 2.2.

3. Use mathematical induction to show that $6^n \equiv 6 \pmod{10}$ for every
positive integer $n$.

4. Let $m$ be an odd positive integer. Show that

(i) $m^2 \equiv m \pmod{2m}$

(ii) $m^2 \equiv 1 \pmod{4}$.

5. Find all solutions of the congruence $54x \equiv 13 \pmod{37}$.

6. Find the remainder when $19^{15}$ is divided by 36.

7. Find all the zero divisors in the rings $\mathbb{Z}_8$ and $\mathbb{Z}_2 + \mathbb{Z}_2 + \mathbb{Z}_2$.

8. Show that an integer is divisible by four if and only if the sum of the
units digit and twice the tens digit is divisible by four.

9. Suppose that $m$ is a positive integer, and let $C$ be one of the
congruence classes modulo $m$. Prove that if $a \in C$ and $b \in C$ then
$\gcd(a, m) = \gcd(b, m)$.

10. Determine if the elements $\overline{13}$ and $\overline{14}$ have inverses in the ring $\mathbb{Z}_{22}$, and
find the inverses if they exist.

11. Prove that if $m \mid n$ and $a \equiv b \pmod{n}$ then $a \equiv b \pmod{m}$.

12. Let $X = \{-1, 0, 1\}$ and define the relation $\sim$ on $\mathbb{Z}^+$ by the rule

$a \sim b$ if $a - b \in X$.

Show that $\sim$ satisfies two of the properties of an equivalence relation,
and give an example to indicate why the third property is not satisfied.

13. Define the relation $\sim$ on $\mathbb{R}$ by the rule

$x \sim y$ if and only if $y - x \in \mathbb{Z}$.

Prove that $\sim$ is an equivalence relation and describe the equivalence
classes.

14. For the given set and relation below, determine which define equivalence
relations, giving proofs or counterexamples:

(i) $S$ is the set of all people living in Australia; $a \sim b$ if $a$ lives within
100 km of $b$.

(ii) $S$ is the set of all integers; $a \sim b$ if $a > b$.

(iii) $S$ is the set of all subsets of a finite set $T$; $a \sim b$ if $a$ and $b$ have
the same number of elements.



5 

Some Ring Theory 


In this chapter we introduce some of the concepts which are needed to study
abstract rings, and prove the first theorems of the subject.

§5a Subrings and subfields 


5.1 Definition (i) A subset $S$ of a ring $R$ is called a subring if $S$ is itself
a ring with respect to the operations of $R$.

(ii) A subset $S$ of a field $F$ is called a subfield if $S$ is itself a field with respect
to the operations of $F$.

For example, the ring $\mathbb{Z}$ is a subring of the field $\mathbb{R}$, but not a subfield.
The rational numbers, $\mathbb{Q}$, form a subfield of $\mathbb{R}$, which is in turn a subfield of
$\mathbb{C}$. The even integers, $2\mathbb{Z}$, form a subring of $\mathbb{Z}$.

5.2 Theorem Let $R$ be a ring and $S$ a subset of $R$ such that

(i) $S$ is nonempty,

(ii) $S$ is closed under multiplication (that is, $ab \in S$ for all $a, b \in S$),

(iii) $S$ is closed under addition ($a + b \in S$ for all $a, b \in S$),

(iv) $S$ is closed under forming negatives ($-a \in S$ for all $a \in S$).

Then $S$ is a subring of $R$.

Conversely, any subring of $R$ has these four properties.

Proof. Assume first that $S$ is a subring of $R$. We must prove that the four
properties above are satisfied.

Since $S$ is a ring it must have a zero element. So $S \ne \emptyset$, and the first
of the properties holds. Note also that if $z$ is the zero of $S$ and 0 the zero of
$R$ then $z + z = z$ (by the defining property of the zero of $S$) and $z + 0 = z$
(by the defining property of the zero of $R$), so that by Theorem 2.10 (i) we
must have $z = 0$.

Now let $a$ and $b$ be arbitrary elements of $S$. Since the operations of
addition and multiplication in $R$ define operations on $S$ we must have that
$a + b$ and $ab$ are elements of $S$. So properties (ii) and (iii) hold. Furthermore,
since $a$ must have a negative in $S$ there must exist $x \in S$ such that

$$a + x = z = x + a.$$

But since $z = 0$ these equations also say that $x$ is a negative of $a$ in $R$. By
Theorem 2.9 it follows that $x = -a$, and we have proved that $-a \in S$, as
required.

For the converse we must assume that $S$ satisfies properties (i)-(iv)
and prove that it satisfies Definition 2.2. Observe that properties (ii) and
(iii) guarantee that the sum and product in $R$ of two elements of $S$ are
actually elements of $S$; hence the operations of $R$ do give rise to operations
on $S$. It remains to prove that Axioms (i)-(vi) of Definition 2.2 are satisfied
in $S$. In each case the proof uses the fact that since $R$ is a ring the same
axiom is satisfied in $R$. The hardest part is to prove that the zero element of
$R$ is actually in $S$; so let us do this first.

We are given that $S$ is nonempty; hence there exists at least one element
$s \in S$. By property (iv) it follows that $-s \in S$, and so by property (iii)

$$0 = s + (-s) \in S.$$

Let $a, b, c \in S$. By Axioms (i), (iv), (v) and (vi) in $R$ we have

$$\begin{aligned}
(a + b) + c &= a + (b + c) \\
a + b &= b + a \\
(ab)c &= a(bc) \\
a(b + c) &= ab + ac \\
(a + b)c &= ac + bc
\end{aligned}$$

and so it follows that Axioms (i), (iv), (v) and (vi) are satisfied in $S$.

We proved above that $0 \in S$. Now if $a$ is any element of $S$ we have (by
Axiom (ii) in $R$) that $a + 0 = a = 0 + a$, and therefore 0 is a zero element for
$S$. Moreover by property (iv) we have that $-a \in S$; thus each element of $S$
has a negative in $S$. So $S$ satisfies Axioms (ii) and (iii). □
Comments ▷▷▷

5.2.1 In the above proof we have also shown that every subring contains
the zero element of the ring.

5.2.2 The point of proving theorems is that the work which goes into
proving them never has to be repeated. One has simply to check that the
hypotheses of the theorem are satisfied to be able to assert that its conclusion
is satisfied, without repeating the steps of the proof. In particular, if we have
to prove that something is a ring we can usually contrive to use a theorem
(such as the above) in whose proof the tedium of checking the axioms one by
one has already been dealt with. ▷▷▷


- Examples - 

#1 Prove that $S = \{\, a + b\sqrt{3} \mid a, b \in \mathbb{Z} \,\}$ is a subring of $\mathbb{R}$.

--▷ By Theorem 5.2 it is sufficient to check that $S$ is nonempty and
satisfies the three closure properties.

It is obvious that $S$ is nonempty; for example $0 = 0 + 0\sqrt{3} \in S$.

Let $\alpha, \beta \in S$. We must show that $\alpha\beta$, $\alpha + \beta$ and $-\alpha$ are all in $S$. By
definition of $S$ we have $\alpha = a + b\sqrt{3}$ and $\beta = c + d\sqrt{3}$ for some $a, b, c, d \in \mathbb{Z}$.
Thus

$$\begin{aligned}
\alpha + \beta &= (a + c) + (b + d)\sqrt{3} \\
\alpha\beta &= (ac + 3bd) + (ad + bc)\sqrt{3} \\
-\alpha &= (-a) + (-b)\sqrt{3}.
\end{aligned}$$

In each case the right hand side has the form (integer) + (integer)$\sqrt{3}$, and so
$\alpha + \beta$, $\alpha\beta$ and $-\alpha$ are all in $S$, as required. ◁--
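
The three closure formulas can be turned into exact integer arithmetic on pairs. In the sketch below the element $a + b\sqrt{3}$ is stored as the pair `(a, b)`; this representation and the function names are choices made here for illustration. The floating-point comparison at the end is only a plausibility check of the product formula.

```python
import math

def add(x, y):
    """(a + b*sqrt3) + (c + d*sqrt3), with elements stored as pairs (a, b)."""
    (a, b), (c, d) = x, y
    return (a + c, b + d)

def mul(x, y):
    """(a + b*sqrt3)(c + d*sqrt3) = (ac + 3bd) + (ad + bc)*sqrt3."""
    (a, b), (c, d) = x, y
    return (a * c + 3 * b * d, a * d + b * c)

def neg(x):
    a, b = x
    return (-a, -b)

def value(x):
    """Approximate real value of the pair, for checking only."""
    a, b = x
    return a + b * math.sqrt(3)

x, y = (1, 2), (3, 1)
print(add(x, y), mul(x, y), neg(x))   # (4, 3) (9, 7) (-1, -2)
assert math.isclose(value(mul(x, y)), value(x) * value(y))
```

Since the results are again pairs of integers, each operation stays inside $S$.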


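#2 Prove that

$$S = \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{Z} \right\}$$

is a subring of $\mathrm{Mat}(2, \mathbb{Z})$.

--▷ $S \ne \emptyset$ since $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \in S$.

Let $\alpha = \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$ and $\beta = \begin{pmatrix} d & e \\ 0 & f \end{pmatrix}$ be arbitrary elements of $S$. Then

$$\begin{aligned}
\alpha + \beta &= \begin{pmatrix} a + d & b + e \\ 0 & c + f \end{pmatrix} \in S, \\
\alpha\beta &= \begin{pmatrix} ad & ae + bf \\ 0 & cf \end{pmatrix} \in S, \\
-\alpha &= \begin{pmatrix} -a & -b \\ 0 & -c \end{pmatrix} \in S.
\end{aligned}$$

Hence the closure properties hold.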

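#3 Let $S = \left\{ \begin{pmatrix} 0 & a \\ b & c \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{Z} \right\}$. Prove that $S$ is not a subring of
$\mathrm{Mat}(2, \mathbb{Z})$.

--▷ The multiplication operation on $\mathrm{Mat}(2, \mathbb{Z})$ does not yield an operation
on $S$, since the product of two elements of $S$ need not be in $S$.
[The displayed example, two matrices in $S$ whose product is not in $S$, is
missing from the source.]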
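
One pair of matrices that exhibits the failure of closure can be checked as follows. The matrix `A` below is a hypothetical choice made for illustration and is not taken from the source; any element of $S$ whose two off-diagonal entries are both nonzero would serve equally well when multiplied by itself.

```python
def matmul2(A, B):
    """Product of 2x2 matrices given as nested tuples ((p, q), (r, s))."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def in_S(M):
    """Membership in S for an integer matrix: the top-left entry must be 0."""
    return M[0][0] == 0

A = ((0, 1), (1, 0))   # hypothetical choice, not taken from the source
print(in_S(A), matmul2(A, A), in_S(matmul2(A, A)))   # True ((1, 0), (0, 1)) False
```

Here `A` lies in $S$ but its square is the identity matrix, whose top-left entry is 1.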







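#4 Let $R = \mathbb{Z}_8$ and let $S \subseteq R$ be given by

$$S = \{\overline{0}, \overline{2}, \overline{4}, \overline{6}\}.$$

Prove that $S$ is a subring of $\mathbb{Z}_8$.

--▷ Observe that $S = \{\, \overline{2r} \mid r \in \mathbb{Z} \,\}$. Obviously $S \ne \emptyset$.

Since

$$\overline{2r} + \overline{2s} = \overline{2r + 2s} \in S,$$

and

$$\overline{2r}\,\overline{2s} = \overline{4rs} = \overline{2(2rs)} \in S,$$

and

$$-(\overline{2r}) = \overline{2(-r)} \in S,$$

the required closure properties hold. ◁--

#5 The subset $\{\overline{1}, \overline{3}, \overline{5}, \overline{7}\}$ of $\mathbb{Z}_8$ is not a subring. It is closed under
multiplication but not addition.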
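
For subsets of $\mathbb{Z}_n$ the four conditions of Theorem 5.2 can be checked exhaustively. The sketch below, with a hypothetical function name, represents congruence classes by least residues and applies it to the subsets of Examples #4 and #5.

```python
def is_subring(S, n):
    """Check the four conditions of Theorem 5.2 for a set S of residues mod n."""
    S = {x % n for x in S}
    return (
        len(S) > 0                                        # (i) nonempty
        and all((a * b) % n in S for a in S for b in S)   # (ii) closed under *
        and all((a + b) % n in S for a in S for b in S)   # (iii) closed under +
        and all((-a) % n in S for a in S)                 # (iv) closed under negatives
    )

print(is_subring({0, 2, 4, 6}, 8))   # True
print(is_subring({1, 3, 5, 7}, 8))   # False
```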


5.3 Theorem Let $F$ be a field and $S$ a subset of $F$ such that

(i) $0 \in S$ and $1 \in S$,

(ii) if $a \in S$ and $b \in S$ then $a + b \in S$ and $ab \in S$,

(iii) if $a \in S$ then $-a \in S$,

(iv) if $a \in S$ and $a \ne 0$ then $a^{-1} \in S$.

Then $S$ is a subfield of $F$.

Conversely, any subfield of $F$ satisfies these properties.

Proof. Since $F$ is a field it is certainly a ring. Suppose that $S$ is a subset
of $F$ satisfying the properties above. Then $S$ satisfies properties (i)-(iv) of
Theorem 5.2, and so by Theorem 5.2 it follows that $S$ is a subring of $F$.
Hence $S$ is a ring.

We are given that the identity element of $F$ is in the subset $S$; hence
$S$ has an identity element. The identity element is nonzero, since by
Definition 2.8 applied to $F$ we know that $1 \ne 0$. Furthermore, $ab = ba$ for all
$a, b \in S$, since by Definition 2.8 the same is true for all $a, b \in F$. Finally, if
$a$ is a nonzero element of $S$ then property (iv) above guarantees that $a$ has an
inverse in $S$. We have now checked that all the requirements of Definition 2.8
are satisfied; thus $S$ is a field, and therefore a subfield of $F$.

The proof of the converse (that any subfield has the listed properties)
is similar to the first part of the proof of Theorem 5.2, and is left to the
exercises. □


- Example - 

#6 $S = \{\, a + b\sqrt{3} \mid a, b \in \mathbb{Q} \,\}$ is a subfield of $\mathbb{R}$.

The hardest part of the proof is to show that $S$ is closed under forming
inverses of its nonzero elements, which is part (iv) of 5.3.

Suppose that $a, b \in \mathbb{Q}$ and that $a + b\sqrt{3} \ne 0$. Then $a^2 - 3b^2 \ne 0$ (by
§3c#2), and we find that

$$(a + b\sqrt{3})^{-1} = \frac{a}{a^2 - 3b^2} + \frac{-b}{a^2 - 3b^2}\sqrt{3}.$$

Since the coefficients on the right hand side are rational numbers, the result
follows.
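
The inverse formula can be checked with exact rational arithmetic. This sketch stores $a + b\sqrt{3}$ as the pair `(a, b)` of `Fraction` values, a representation chosen here for illustration, and verifies that multiplying an element by the claimed inverse gives $1 + 0\sqrt{3}$.

```python
from fractions import Fraction

def mul(x, y):
    """(a + b*sqrt3)(c + d*sqrt3) = (ac + 3bd) + (ad + bc)*sqrt3."""
    (a, b), (c, d) = x, y
    return (a * c + 3 * b * d, a * d + b * c)

def inverse(x):
    """Inverse of a + b*sqrt3 (stored as the pair (a, b)) for rational a, b not both 0."""
    a, b = Fraction(x[0]), Fraction(x[1])
    norm = a * a - 3 * b * b      # nonzero because sqrt3 is irrational
    return (a / norm, -b / norm)

x = (Fraction(1), Fraction(2))    # the element 1 + 2*sqrt3
print(inverse(x))                 # (Fraction(-1, 11), Fraction(2, 11))
print(mul(x, inverse(x)))         # (Fraction(1, 1), Fraction(0, 1))
```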


§5b Homomorphisms 

5.4 Definition Let $R$ and $S$ be rings. A mapping $\theta\colon R \to S$ is called a
ring homomorphism if

$$\theta(a + b) = \theta(a) + \theta(b)$$

and

$$\theta(ab) = \theta(a)\theta(b)$$

for all $a$ and $b$ in $R$.

Definition 5.4 is sometimes expressed as follows: 

A ring homomorphism from R to S is a mapping 
which preserves addition and multiplication. 


Some elementary properties of homomorphisms are listed in the next 
theorem. 

5.5 Theorem If $\theta\colon R \to S$ is a ring homomorphism then

(i) $\theta(0_R) = 0_S$ (where ‘$0_R$’ means ‘zero element of $R$’ and ‘$0_S$’ means
‘zero element of $S$’),

(ii) $\theta(-a) = -\theta(a)$ for all $a \in R$,

(iii) $\theta(a - b) = \theta(a) - \theta(b)$ for all $a, b \in R$,

(iv) $\theta(a_1 a_2 \cdots a_n) = \theta(a_1)\theta(a_2) \cdots \theta(a_n)$ for all $a_1, a_2, \dots, a_n \in R$,

(v) $\theta(a_1 + a_2 + \cdots + a_n) = \theta(a_1) + \theta(a_2) + \cdots + \theta(a_n)$ for all $a_1, a_2, \dots, a_n \in R$.

Proof. (i) Using the defining properties of $0_R$ and $0_S$ and the fact that $\theta$
preserves addition we have

$$\theta(0_R) + \theta(0_R) = \theta(0_R + 0_R) = \theta(0_R) = \theta(0_R) + 0_S$$

and by Theorem 2.10 (i) it follows that $\theta(0_R) = 0_S$.

(ii) Let $a \in R$. Since $\theta$ preserves addition

$$\theta(a) + \theta(-a) = \theta(a + (-a)) = \theta(0_R) = 0_S$$

by part (i). Applying the commutative law for addition we obtain that
$\theta(-a) + \theta(a) = 0_S$ also. By Theorem 2.9 (uniqueness of negatives) it
follows that $\theta(-a) = -\theta(a)$.

(iii) Let $a, b \in R$. We have

$$\theta(a - b) = \theta(a + (-b)) = \theta(a) + \theta(-b) = \theta(a) + (-\theta(b)) = \theta(a) - \theta(b).$$

(iv) If $n = 1$ this is immediate. Proceeding by induction we assume that
$n > 1$ and the statement holds with $n - 1$ in place of $n$. Let $a_1, a_2, \dots,
a_n \in R$. Using the fact that $\theta$ preserves multiplication and the induction
hypothesis we obtain

$$\begin{aligned}
\theta(a_1 a_2 \cdots a_n) &= \theta((a_1 a_2 \cdots a_{n-1}) a_n) \\
&= \theta(a_1 a_2 \cdots a_{n-1})\,\theta(a_n) \\
&= (\theta(a_1)\theta(a_2) \cdots \theta(a_{n-1}))\,\theta(a_n) \\
&= \theta(a_1)\theta(a_2) \cdots \theta(a_n)
\end{aligned}$$

as required.

(v) The proof of this is similar to the proof of (iv) and is omitted. □

Comments ▷▷▷

5.5.1 If $R$ has an identity element 1 then it is not necessarily true that
$\theta(1)$ is an identity element of $S$. If $\theta$ is surjective (onto), however, then $\theta(1)$
must be an identity. See the exercises at the end of the chapter.

5.5.2 If $S$ is a subring of the ring $R$ and $\theta\colon R \to T$ is a homomorphism
then the restriction of $\theta$ to $S$, that is, the mapping $\phi\colon S \to T$ given by
$\phi(s) = \theta(s)$ for all $s \in S$, is clearly a homomorphism. ▷▷▷

- Examples - 

#7 Define $\theta\colon \mathbb{Z} \to \mathbb{Z}_n$ by $\theta(a) = \overline{a}$ for all $a \in \mathbb{Z}$. (That is, $\theta$ takes any
integer to its congruence class modulo $n$.) By definition of addition and
multiplication of congruence classes

$$\theta(a + b) = \overline{a + b} = \overline{a} + \overline{b} = \theta(a) + \theta(b)$$

$$\theta(ab) = \overline{ab} = \overline{a}\,\overline{b} = \theta(a)\theta(b)$$

So $\theta$ is a homomorphism. (This is an example of a homomorphism from a
ring to a quotient ring of itself. Whenever a quotient ring can be formed such
a homomorphism exists.)

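#8 Let

$$\theta(a + b\mathbf{i}) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$

define a map $\theta\colon \mathbb{C} \to \mathrm{Mat}(2, \mathbb{R})$. Since we have†

$$\begin{aligned}
\theta((a + b\mathbf{i}) + (c + d\mathbf{i})) &= \theta((a + c) + (b + d)\mathbf{i}) \\
&= \begin{pmatrix} a + c & b + d \\ -(b + d) & a + c \end{pmatrix} \\
&= \begin{pmatrix} a & b \\ -b & a \end{pmatrix} + \begin{pmatrix} c & d \\ -d & c \end{pmatrix} \\
&= \theta(a + b\mathbf{i}) + \theta(c + d\mathbf{i}),
\end{aligned}$$

and

$$\begin{aligned}
\theta((a + b\mathbf{i})(c + d\mathbf{i})) &= \theta((ac - bd) + (ad + bc)\mathbf{i}) \\
&= \begin{pmatrix} ac - bd & ad + bc \\ -(ad + bc) & ac - bd \end{pmatrix} \\
&= \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \begin{pmatrix} c & d \\ -d & c \end{pmatrix} \\
&= \theta(a + b\mathbf{i})\,\theta(c + d\mathbf{i})
\end{aligned}$$

it follows that $\theta$ is a homomorphism.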
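
The two identities can be spot-checked numerically. The sketch below uses Python's built-in `complex` type and small integer-valued inputs, so that the floating-point comparisons are exact; the helper names are hypothetical, and a check on a few sample values is of course no proof.

```python
def theta(z):
    """Send a + bi to the matrix ((a, b), (-b, a))."""
    a, b = z.real, z.imag
    return ((a, b), (-b, a))

def mat_add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def mat_mul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

z, w = complex(1, 2), complex(3, -1)
print(theta(z + w) == mat_add(theta(z), theta(w)))   # True
print(theta(z * w) == mat_mul(theta(z), theta(w)))   # True
```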


If a homomorphism $\theta\colon R \to S$ is bijective (one-to-one and onto) then it
sets up a one-to-one correspondence between elements of $R$ and elements of
$S$ such that the sum of the elements of $S$ corresponding to two given elements
$a, b \in R$ is the element corresponding to $a + b$, and similarly their product is
the element of $S$ corresponding to $ab$. Thus the rings $R$ and $S$ are essentially
the same as one another: although they have different elements, $R$ and $S$
have the same underlying additive and multiplicative structure.


† Here and subsequently the boldface letter $\mathbf{i}$ denotes a complex square root
of $-1$.
5.6 Definition A ring homomorphism which is bijective is called an
isomorphism. If there exists an isomorphism $\theta\colon R \to S$ then $R$ and $S$ are said
to be isomorphic, and we write ‘$R \cong S$’.


- Examples - 

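#9 The map

$$\theta\colon a + b\mathbf{i} \mapsto \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$

provides an isomorphism from $\mathbb{C}$ to the subring $S$ of $\mathrm{Mat}(2, \mathbb{R})$ consisting of
all matrices of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$.

To prove this one must prove first that $S$ is a subring of $\mathrm{Mat}(2, \mathbb{R})$, then
prove that $\theta$ preserves addition and multiplication and is bijective. We omit
the proof that $S$ is a subring since it is similar to the proofs in #1 and #2
above.

Let $\alpha, \beta \in \mathbb{C}$ with $\theta(\alpha) = \theta(\beta)$. We have $\alpha = a + b\mathbf{i}$, $\beta = c + d\mathbf{i}$ for
some $a, b, c, d \in \mathbb{R}$, and since

$$\begin{pmatrix} a & b \\ -b & a \end{pmatrix} = \theta(a + b\mathbf{i}) = \theta(c + d\mathbf{i}) = \begin{pmatrix} c & d \\ -d & c \end{pmatrix}$$

we see that $a = c$ and $b = d$, whence $\alpha = \beta$. Thus $\theta$ is injective.

Let $A$ be an arbitrary element of $S$. Then $A = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ for some
$a, b \in \mathbb{R}$, and so $A = \theta(a + b\mathbf{i})$. Hence $\theta$ is surjective. This completes
the proof, since we have already seen in #8 that $\theta$ preserves addition and
multiplication.

Note that the fact that $\theta$ preserves addition and multiplication, together
with the fact that $S = \operatorname{im} \theta$, can be used to prove that $S$ is closed under
addition and multiplication, as follows. Let $A, B \in S$. Then $A = \theta(\alpha)$,
$B = \theta(\beta)$ for some $\alpha, \beta \in \mathbb{C}$, and so

$$\begin{aligned}
A + B &= \theta(\alpha) + \theta(\beta) = \theta(\alpha + \beta) \in \operatorname{im} \theta = S \\
AB &= \theta(\alpha)\theta(\beta) = \theta(\alpha\beta) \in \operatorname{im} \theta = S
\end{aligned}$$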
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#10 If R is any ring the direct sum R + R is isomorphic to the subring of 
Mat(2, R) consisting of all diagonal matrices. The map 


(a, 6 ) 


a 0 
0 b 


is an isomorphism. 

Let $\psi$ be the given map. By the definition of addition and multiplication
in $R \dotplus R$ we have
\[ (a, b) + (c, d) = (a + c, b + d) \]
\[ (a, b)(c, d) = (ac, bd) \]
so that
\[ \psi\bigl((a, b) + (c, d)\bigr) = \psi(a + c, b + d) = \begin{pmatrix} a + c & 0 \\ 0 & b + d \end{pmatrix}
   = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} + \begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix} = \psi(a, b) + \psi(c, d), \]
and $\psi\bigl((a, b)(c, d)\bigr) = \psi(ac, bd) = \psi(a, b)\psi(c, d)$ similarly.

We have shown that $\psi$ preserves addition and multiplication. It is also
necessary to show that $\psi$ is bijective and that the set of all diagonal matrices
is a subring of $\mathrm{Mat}(2, R)$. These proofs are straightforward and are omitted.

#11 Prove that if $F$ is a field and $R$ is a ring isomorphic to $F$ then $R$ is
also a field.

--> We must show that $R$ is commutative and has a nonzero identity
element, and that all nonzero elements of $R$ have inverses.

Let $\varphi\colon F \to R$ be an isomorphism, and let $x, y \in R$. Since $\varphi$ is surjective
there exist $a, b \in F$ with $\varphi(a) = x$ and $\varphi(b) = y$. Multiplication in $F$ is
commutative (since $F$ is a field); so $ab = ba$, and
\[ xy = \varphi(a)\varphi(b) = \varphi(ab) = \varphi(ba) = \varphi(b)\varphi(a) = yx \]
showing that $R$ is commutative.

By Exercise 2 at the end of the chapter, $\varphi(1)$ is an identity for $R$. Since
$\varphi(0) = 0$ and $1 \neq 0$ the fact that $\varphi$ is injective shows that $\varphi(1)$ is nonzero.

Suppose that $x$ is a nonzero element of $R$. Then $x = \varphi(a)$ for some
$a \in F$, and since
\[ \varphi(0) = 0 \neq x = \varphi(a) \]
injectivity of $\varphi$ gives $a \neq 0$. Since $F$ is a field it follows that $a$ has an inverse,
and
\[ x\varphi(a^{-1}) = \varphi(a^{-1})x = \varphi(a^{-1})\varphi(a) = \varphi(a^{-1}a) = \varphi(1). \]
Thus $\varphi(a^{-1})$ is an inverse for $x$; so all nonzero elements of $R$ have inverses.


§5c Ideals 

5.7 Definition A subring $I$ of a ring $R$ is called an ideal of $R$ if $ar \in I$
and $ra \in I$ for all $a \in I$ and $r \in R$.

Comment »>

5.7.1 If $I$ is an ideal in $R$ then multiplying an element of $I$ by any
element of $R$ must give an element of $I$. Note that this is a more
stringent requirement than closure under multiplication, which merely says
that the product of two elements of $I$ lies in $I$. An ideal must be closed under
multiplication by arbitrary elements of the ring. »>

- Example - 

#12 Let $R = \mathbb{Z}$ and $I = 2\mathbb{Z}$. Then $I$ is nonempty ($0 \in 2\mathbb{Z}$), closed under
addition (the sum of two even integers is even), closed under multiplication
(the product of two even integers is even), and closed under forming negatives
(the negative of an even integer is even). So $I$ is a subring of $R$. To
observe that in fact it is an ideal it remains to show that $I$ is closed under
multiplication by arbitrary elements of $R$; that is, to show that the product of
an even integer and an arbitrary integer gives an even integer. But this is
obvious.


Note that in the above example it was not really necessary to prove
closure under multiplication separately since it follows from closure under
multiplication by ring elements. This observation yields the following proposition:

5.8 Proposition A subset $I$ of a ring $R$ is an ideal if and only if the
following all hold:

(i) $I$ is nonempty.

(ii) For all $x$ and $y$, if $x \in I$ and $y \in I$ then $x + y \in I$.

(iii) For all $x$, if $x \in I$ then $-x \in I$.

(iv) For all $x$ and $y$, if $x \in I$ and $y \in R$ then $xy \in I$ and $yx \in I$.

Proof. Suppose first that $I$ is an ideal of $R$. Then $I$ is a subring of $R$, and
by Theorem 5.2 properties (i), (ii) and (iii) above all hold. Property (iv)
holds too since it is explicitly assumed in the definition of an ideal.

Conversely, assume that $I$ satisfies properties (i)-(iv). As remarked
above it follows from property (iv) that $I$ is closed under multiplication; thus
all the requirements of Theorem 5.2 are satisfied, and it follows that $I$ is a
subring of $R$. This together with property (iv) shows that $I$ is an ideal. □
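
For a finite ring the four conditions of Proposition 5.8 can be tested by brute force. The sketch below assumes the ring is $\mathbb{Z}_n$ with classes represented by the integers 0, ..., n-1; the function name `is_ideal` is hypothetical.

```python
def is_ideal(subset, n):
    """Check conditions (i)-(iv) of Proposition 5.8 for a subset of Z_n."""
    I = set(subset)
    ring = range(n)
    nonempty = len(I) > 0                                      # (i)
    closed_add = all((x + y) % n in I for x in I for y in I)   # (ii)
    closed_neg = all((-x) % n in I for x in I)                 # (iii)
    # (iv): Z_n is commutative, so xy in I already covers yx in I.
    absorbs = all((x * y) % n in I for x in I for y in ring)
    return nonempty and closed_add and closed_neg and absorbs


print(is_ideal({0, 2, 4, 6}, 8))  # the even classes in Z_8
print(is_ideal({0, 1}, 8))        # fails closure under addition
```

Because condition (iv) quantifies over the whole ring, the check costs on the order of |I| times n multiplications, which is only practical for small n.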


- Example - 


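#13 Let
\[ R = \left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{Z} \right\} \]
and
\[ I = \left\{ \begin{pmatrix} 0 & d \\ 0 & 0 \end{pmatrix} \;\middle|\; d \in \mathbb{Z} \right\}. \]
Prove that $R$ is a subring of $\mathrm{Mat}(2, \mathbb{Z})$ and $I$ is an ideal of $R$.

--> That $R$ is a subring of $\mathrm{Mat}(2, \mathbb{Z})$ was proved in #2. Clearly $I$ is
nonempty; for instance, $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \in I$. Let $x$ and $y$ be arbitrary elements of
$I$ and $r$ an arbitrary element of $R$. Then for some integers $a$, $b$, $c$, $d$, $e$,
\[ x = \begin{pmatrix} 0 & a \\ 0 & 0 \end{pmatrix}, \qquad
   y = \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}, \qquad
   r = \begin{pmatrix} c & d \\ 0 & e \end{pmatrix} \]
giving
\[ x + y = \begin{pmatrix} 0 & a + b \\ 0 & 0 \end{pmatrix}, \qquad
   -x = \begin{pmatrix} 0 & -a \\ 0 & 0 \end{pmatrix}, \qquad
   rx = \begin{pmatrix} 0 & ca \\ 0 & 0 \end{pmatrix}, \qquad
   xr = \begin{pmatrix} 0 & ae \\ 0 & 0 \end{pmatrix} \]
and since these are all in $I$ it follows that $I$ is an ideal. <--

§5d The characteristic of a ring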

Let $R$ be any ring. If $a \in R$ we define
\[ 1a = a, \qquad 2a = a + a, \qquad 3a = a + a + a \]
and so on. In general, if $m$ is any positive integer,
\[ ma = \underbrace{a + a + \cdots + a}_{m \text{ terms}}. \]
If $m$ is a negative integer we define
\[ ma = -\bigl((-m)a\bigr) \]
observing that $(-m)a$ has already been defined since $-m$ is positive. And
for the case $m = 0$ we define $0a = 0$. We have now defined $ma$ whenever
$m \in \mathbb{Z}$ and $a \in R$. This is a method of multiplying ring elements by integers,
and is not to be confused with the multiplication operation within $R$ itself.
(But, fortunately, the value $0a$ is the same whether 0 is interpreted as an
integer or the zero of the ring, and the same applies to $1a$ if $R$ has an identity
element.)

Similarly we define $a^m = aa \cdots a$ ($m$ factors) if $m \in \mathbb{Z}^+$; if $R$ has an
identity element 1 we define $a^0 = 1$ for all $a \in R$; and if $m$ is negative and $a \in R$
has an inverse we define $a^m = (a^{-1})^{-m}$. The following should be clear:

5.9 Proposition Let $R$ be any ring, $a \in R$ and $m, n \in \mathbb{Z}$. Then

(i) $m(na) = (mn)a$ and $(m + n)a = ma + na$,

(ii) $(a^m)^n = a^{mn}$ and $a^{m+n} = a^m a^n$.

(If either $m$ or $n$ is negative the second part is only applicable if $a$ has an
inverse; similarly, if either $m$ or $n$ is zero it is only applicable if $R$ has an
identity.)

5.10 Definition Let $R$ be a ring. If there is a positive integer $n$ such that
$na = 0$ for all $a \in R$ then the least such $n$ is called the characteristic of $R$.
If there is no such $n$ then $R$ is said to have characteristic 0.

- Examples -

#14 The characteristic of $\mathbb{Z}_2$ is 2, the characteristic of $\mathbb{Z}_3$ is 3, and so on.

#15 Let $S$ be the subring of $\mathbb{Z}_8$ given by $S = \{\bar 0, \bar 2, \bar 4, \bar 6\}$. Observe that
\[ \bar 2 \neq \bar 0, \qquad \bar 2 + \bar 2 = \bar 4 \neq \bar 0, \qquad \bar 2 + \bar 2 + \bar 2 = \bar 6 \neq \bar 0. \]
So the characteristic of $S$ is not 1, 2, or 3. But
\[ \bar 2 + \bar 2 + \bar 2 + \bar 2 = \overline{8} = \bar 0 \]
\[ \bar 4 + \bar 4 + \bar 4 + \bar 4 = \overline{16} = \bar 0 \]
\[ \bar 6 + \bar 6 + \bar 6 + \bar 6 = \overline{24} = \bar 0 \]
\[ \bar 0 + \bar 0 + \bar 0 + \bar 0 = \bar 0. \]
So $S$ has characteristic 4. (This shows that the characteristic of a subring
can be less than the characteristic of the ring, since $\mathbb{Z}_8$ has characteristic 8.)
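
The computation in #15 can be mechanised for subsets of $\mathbb{Z}_n$. The sketch below assumes classes are represented by integers modulo `modulus`; the function name `characteristic` is hypothetical, and the loop cannot report characteristic 0 (which does not arise here, because n = `modulus` always annihilates every class).

```python
def characteristic(elements, modulus):
    """Least positive n with n*a = 0 in Z_modulus for every a in elements."""
    n = 1
    # The loop stops at n = modulus at the latest.
    while any((n * a) % modulus != 0 for a in elements):
        n += 1
    return n


S = [0, 2, 4, 6]
print(characteristic(S, 8))         # the subring S of Z_8 from #15
print(characteristic(range(8), 8))  # the whole ring Z_8
```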


If a ring $R$ has an identity element, 1, then $na = 0$ for all $a \in R$ if and
only if $n1 = 0$. From this we can deduce the following proposition:

5.11 Proposition If $R$ is a ring with identity element 1 then the characteristic
of $R$ is the least positive integer $n$ such that $n1 = 0$, or zero if there
is no such $n$.

Proof. Define
\[ H = \{\, m \in \mathbb{Z}^+ \mid m1 = 0 \,\} \]
and
\[ K = \{\, m \in \mathbb{Z}^+ \mid ma = 0 \text{ for all } a \in R \,\}. \]
We prove that $H = K$.

Let $m \in H$. Then $m1 = 0$, and so for all $a \in R$ we have
\[ ma = \underbrace{a + a + \cdots + a}_{m \text{ terms}} = a(\underbrace{1 + 1 + \cdots + 1}_{m \text{ terms}}) = a(m1) = a0 = 0. \]
Hence $m \in K$.

Conversely, if $m \in K$ then $ma = 0$ for all $a \in R$, and, in particular,
$m1 = 0$, whence $m \in H$. Thus $m \in K$ if and only if $m \in H$, and so $H = K$,
as claimed.

By Definition 5.10 the characteristic of $R$ is the least element of $K$, or
zero if $K = \emptyset$. Since $H = K$ this shows that the characteristic of $R$ is the
least element of $H$, or zero if $H = \emptyset$, and this is precisely the assertion of
Proposition 5.11. □

5.12 Theorem Let $R$ be a ring with identity $1 \neq 0$ and let $S$ be the subset
of $R$ given by $S = \{\, n1 \mid n \in \mathbb{Z} \,\}$. Then $S$ is a subring of $R$ and

(i) if $R$ has characteristic 0 then $S \cong \mathbb{Z}$,

(ii) if $R$ has characteristic $m \neq 0$ then $S \cong \mathbb{Z}_m$.

Proof. $S$ is nonempty since, for instance, $1 \in S$. Let $a, b \in S$. Then
$a = n1$, $b = m1$ for some $n, m \in \mathbb{Z}$, and, by Proposition 5.9,
\[ a + b = n1 + m1 = (n + m)1 \in S \]
\[ ab = (n1)(m1) = (nm)1 \in S \]
\[ -a = -(n1) = (-n)1 \in S \]
so that by Theorem 5.2 it follows that $S$ is a subring of $R$.

(i) Assume that $R$ has characteristic 0.

Define a function $\psi\colon \mathbb{Z} \to S$ by
\[ \psi(r) = r1 \quad \text{for all } r \in \mathbb{Z}. \]
We have
\[ \psi(r + s) = (r + s)1 = r1 + s1 = \psi(r) + \psi(s) \]
\[ \psi(rs) = (rs)1 = (r1)(s1) = \psi(r)\psi(s) \]
so that $\psi$ is a homomorphism. To show that $S \cong \mathbb{Z}$ it remains to show that
$\psi$ is bijective.

Let $a \in S$. Then, for some $n \in \mathbb{Z}$, $a = n1 = \psi(n)$. Thus $\psi$ is surjective.

Let $r, s \in \mathbb{Z}$ be such that $\psi(r) = \psi(s)$. Then $r1 = s1$, and hence
$(r - s)1 = 0$. But since $R$ has characteristic 0 we know by Proposition 5.11
that there is no positive integer $n$ such that $n1 = 0$. So $r - s \le 0$, and by
the same reasoning $s - r \le 0$. Hence $r = s$, and we have shown that $\psi$ is
injective.

(ii) Suppose that the characteristic of $R$ is $m > 0$. Then $m$ is the least
positive integer which annihilates 1 (in the sense that $m1 = 0$). We show
first that if $r, s \in \mathbb{Z}$ then $r1 = s1$ if and only if $r \equiv s \pmod{m}$.

If $r \equiv s \pmod{m}$ then $r - s = tm$ for some $t \in \mathbb{Z}$, and
\[ r1 = (s + tm)1 = s1 + (tm)1 = s1 + t(m1) = s1 + t0 = s1 + 0 = s1. \]

Suppose, conversely, that $r1 = s1$. Then $(r - s)1 = 0$, and, by what we
have just proved, $k1 = 0$ for all $k \in \mathbb{Z}$ with $k \equiv (r - s) \pmod{m}$. By
Theorem 4.5 (ii) we may choose such a $k$ with $0 \le k < m$. But since there
is no positive integer less than $m$ which annihilates 1 it follows that $k = 0$.
Thus $(r - s) \equiv 0$, and therefore $r \equiv s \pmod{m}$.

We now define a function $\theta\colon \mathbb{Z}_m \to S$ by $\theta(\bar r) = r1$ for all $r \in \mathbb{Z}$; this is
unambiguous since if $\bar r = \bar s$ then $r \equiv s$ and hence $r1 = s1$, and it defines $\theta$
on all elements of $\mathbb{Z}_m$ since every element of $\mathbb{Z}_m$ is of the form $\bar r$.

By the definitions of addition and multiplication in $\mathbb{Z}_m$ we find that
\[ \theta(\bar r + \bar s) = \theta(\overline{r + s}) = (r + s)1 = r1 + s1 = \theta(\bar r) + \theta(\bar s) \]
\[ \theta(\bar r \bar s) = \theta(\overline{rs}) = (rs)1 = (r1)(s1) = \theta(\bar r)\theta(\bar s) \]
so that $\theta$ is a homomorphism. Since every element of $S$ is of the form
$r1 = \theta(\bar r)$ for some $r \in \mathbb{Z}$ it follows that $\theta$ is surjective. If $\theta(\bar r) = \theta(\bar s)$
then $r1 = s1$ so that, as shown above, $r \equiv s$, and $\bar r = \bar s$. Thus $\theta$ is injective
also, and $S \cong \mathbb{Z}_m$. □


- Example - 

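#16 In this example we will be dealing simultaneously with rings $\mathbb{Z}_2$, $\mathbb{Z}_3$
and $\mathbb{Z}_6$. To avoid confusion we will use the following notation (for $a \in \mathbb{Z}$):

'$\dot a$' denotes the congruence class of $a$ modulo 2,

'$\tilde a$' denotes the congruence class of $a$ modulo 3,

'$\bar a$' denotes the congruence class of $a$ modulo 6.

(Obviously it would not be suitable to use '$\bar a$' for all of these three!)

We have that
\[ \mathbb{Z}_2 = \{\dot 0, \dot 1\} \]
\[ \mathbb{Z}_3 = \{\tilde 0, \tilde 1, \tilde 2\} \]
\[ \mathbb{Z}_6 = \{\bar 0, \bar 1, \bar 2, \bar 3, \bar 4, \bar 5\}. \]

Consider now the ring $\mathbb{Z}_2 \dotplus \mathbb{Z}_3$, which consists of all pairs $(a, b)$ with
$a \in \mathbb{Z}_2$ and $b \in \mathbb{Z}_3$. It has six elements, since there are two choices for $a$ and
three choices for $b$. The element $(\dot 1, \tilde 1)$ is easily seen to be an identity element
(since $\dot 1$ is an identity for $\mathbb{Z}_2$ and $\tilde 1$ is an identity element for $\mathbb{Z}_3$) and to be
nonzero. For any $n \in \mathbb{Z}^+$ we have
\[ n(\dot 1, \tilde 1) = \underbrace{(\dot 1, \tilde 1) + \cdots + (\dot 1, \tilde 1)}_{n \text{ terms}} = (n\dot 1, n\tilde 1) = (\dot n, \tilde n) \]
and this equals the zero element $(\dot 0, \tilde 0)$ if and only if $\dot n = \dot 0$ and $\tilde n = \tilde 0$. But
$\dot n = \dot 0$ if and only if $n$ is even, and $\tilde n = \tilde 0$ if and only if $n$ is divisible by three.
So $n(\dot 1, \tilde 1) = (\dot 0, \tilde 0)$ if and only if $n$ is divisible by six. Hence the characteristic
of $\mathbb{Z}_2 \dotplus \mathbb{Z}_3$ is six.

By Theorem 5.12 the set $\{\, n(\dot 1, \tilde 1) \mid n \in \mathbb{Z} \,\}$ is a subring of $\mathbb{Z}_2 \dotplus \mathbb{Z}_3$
isomorphic to $\mathbb{Z}_6$. However, $\mathbb{Z}_2 \dotplus \mathbb{Z}_3$ has only six elements altogether, and so
this subring must equal the whole of $\mathbb{Z}_2 \dotplus \mathbb{Z}_3$. So we have shown that
\[ \mathbb{Z}_2 \dotplus \mathbb{Z}_3 \cong \mathbb{Z}_6. \]
The proof of Theorem 5.12 can be used to write down an explicit isomorphism
$\theta\colon \mathbb{Z}_6 \to \mathbb{Z}_2 \dotplus \mathbb{Z}_3$. We obtain:
\[ \bar 1 \mapsto (\dot 1, \tilde 1) \]
\[ \bar 2 \mapsto (\dot 1, \tilde 1) + (\dot 1, \tilde 1) = (\dot 0, \tilde 2) \]
\[ \bar 3 \mapsto (\dot 1, \tilde 1) + (\dot 1, \tilde 1) + (\dot 1, \tilde 1) = (\dot 1, \tilde 0) \]
\[ \bar 4 \mapsto 4(\dot 1, \tilde 1) = (\dot 4, \tilde 4) = (\dot 0, \tilde 1) \]
\[ \bar 5 \mapsto 5(\dot 1, \tilde 1) = (\dot 5, \tilde 5) = (\dot 1, \tilde 2) \]
\[ \bar 6 \mapsto 6(\dot 1, \tilde 1) = (\dot 6, \tilde 6) = (\dot 0, \tilde 0). \]
(Note that $\bar 6 = \bar 0$.)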
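
The table in #16 can be reproduced and checked exhaustively. In this sketch congruence classes are represented by their least nonnegative residues, and the names `phi`, `add` and `mul` are illustrative only.

```python
from itertools import product


def phi(n):
    """Send the class of n modulo 6 to n(1, 1) = (n mod 2, n mod 3)."""
    return (n % 2, n % 3)


def add(p, q):
    """Componentwise addition in the direct sum of Z_2 and Z_3."""
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 3)


def mul(p, q):
    """Componentwise multiplication in the direct sum of Z_2 and Z_3."""
    return ((p[0] * q[0]) % 2, (p[1] * q[1]) % 3)


Z6 = range(6)
table = {n: phi(n) for n in Z6}

# Six distinct images among six possible pairs means phi is a bijection.
bijective = len(set(table.values())) == 6
preserves_add = all(phi((r + s) % 6) == add(phi(r), phi(s)) for r, s in product(Z6, Z6))
preserves_mul = all(phi((r * s) % 6) == mul(phi(r), phi(s)) for r, s in product(Z6, Z6))

print(table)
print(bijective, preserves_add, preserves_mul)
```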


Exercises 

1 . Complete the proof of Theorem 5.3. 

2. Let $R$ and $S$ be rings and let $\theta\colon R \to S$ be a surjective ring homomorphism.
Let $R$ have an identity element 1. Prove that $\theta(1)$ is an identity
element in $S$.

3. In the ring $\mathbb{Z}$ find:

(i) the smallest subring containing 7,

(ii) the smallest subring containing 5 and 7.

4. Prove that Con (the constructible numbers) is a subfield of $\mathbb{R}$.

(Hint: Use Theorems 5.3 and 1.1.)

5. Suppose $K$ is a subfield of $\mathbb{C}$ (the complex numbers) which contains $\mathbb{R}$
(the real numbers). Show that either $K = \mathbb{R}$ or $K = \mathbb{C}$.

(Hint: Assume that $K$ contains $\mathbb{R}$ and some complex number not
in $\mathbb{R}$, and show that $K$ contains all complex numbers.)

6. Let $\mathbb{C}$ be the field of complex numbers and let $\theta\colon \mathbb{C} \to \mathbb{C}$ be defined by
the formula
\[ \theta(a + \mathbf{i}b) = a - \mathbf{i}b \quad \text{for all } a, b \in \mathbb{R}. \]
Show that $\theta$ is a ring isomorphism from $\mathbb{C}$ to $\mathbb{C}$.

7. Prove that isomorphic rings have the same characteristic.

8. Let $D$ be an integral domain with nonzero characteristic $m$. Prove that
$m$ is a prime. (Hint: If $m = rs$ then $(r1)(s1) = 0$.)

9. Let $R$ and $S$ be rings, and let $\theta\colon R \to S$ be an isomorphism. Prove that
$a \in R$ is a zero divisor if and only if $\theta(a) \in S$ is a zero divisor. Deduce
that isomorphic rings have the same number of zero divisors.

10 . Prove that Z 12 and -j- Z 2 are not isomorphic. 

11. Define
\[ \mathbb{Q}[\sqrt[3]{2}] = \{\, a + b\sqrt[3]{2} + c(\sqrt[3]{2})^2 \mid a, b, c \in \mathbb{Q} \,\}. \]
Prove that $\mathbb{Q}[\sqrt[3]{2}]$ is a subring of $\mathbb{R}$, and prove also that it is an integral
domain.

12. Let $\mathbb{Q}_2 = \{\, m/2^k \mid m \in \mathbb{Z} \text{ and } k \in \mathbb{Z}^+ \,\}$, the set of all rational numbers
with denominator a power of 2. Show that $\mathbb{Q}_2$ is a subring of $\mathbb{Q}$ but not
a subfield.

13. Let
\[ R = \left\{ \begin{pmatrix} a & b & c \\ 0 & d & e \\ 0 & f & g \end{pmatrix} \;\middle|\; a, b, c, d, e, f, g \in \mathbb{Z} \right\}. \]
Is $R$ a ring with respect to the operations of addition and multiplication
defined as usual for matrices?

Let 0: R —>• Mat(2, Z) be the mapping defined by 


0 


a b c 
Ode 
0 f g 



(i) Is 6 injective? 

{ii) Is 9 surjective? 

(in) Is 6 a homomorphism? 

( iv ) Define a relation ~ on R by 

( a b c\ / p q r 

0 d e J ~ | 0 s t 

0 f g) \0 U V 

Is ~ an equivalence relation? 

( v ) Define ~ on R by 

( a b c\ / p q r\ 

0 d e ~ 0 s t if and only if b = q and c — r. 

0 f g) \0 u v) 

Is ~ an equivalence relation? 

(vi) Suppose that X,Y, Z,W E R are such that X ~ Y and Z ~ W. 
Is it true that X + Z ~ Y + W? Is it true that XZ ~ YW? 

( vii ) The same as (vi) with ~ replaced by 

(viii) Let R be the set of all the equivalence classes into which R is 
partitioned by the relation Show that there is a one to one 
correspondence between R and Mat(2,Z). 




Polynomials 


Given a ring $R$ it is possible to form new rings containing $R$ as a subring
by "adjoining" new elements to $R$. The simplest example of this is the
ring of polynomials in $X$ with coefficients from $R$, and it is necessary to
study polynomial rings before dealing with the general problem of adjoining
elements. The study of geometrical constructions leads naturally to this
problem. For, suppose that $\alpha_1, \alpha_2, \ldots, \alpha_n$ are the points obtained in a
geometrical construction, and let $S_i$ be the smallest subring of $\mathbb{R}$ containing
the coordinates of $\alpha_1, \alpha_2, \ldots, \alpha_i$. Then $S_{i+1}$ can be thought of as obtained
from $S_i$ by adjoining the coordinates of $\alpha_{i+1}$.


§6a Definitions


6.1 Definition Let $R$ be a commutative ring which has a nonzero identity
element 1. A polynomial in the indeterminate $X$ over $R$ is an expression of
the form

(*) $a_0 + a_1 X + \cdots + a_n X^n$

where $n$ is a positive integer and $a_0, a_1, a_2, \ldots, a_n \in R$. We call $a_i$ the $i$th
coefficient of the polynomial.

Comments »[> 

6.1.1 It is possible to define polynomials over arbitrary rings, but in this 
course we will only talk about polynomials over commutative rings with 1. 

6.1.2 The coefficients of the polynomial are elements of the ring $R$, but
$X$ is not. In fact $X$, $X^2$, $X^3$, ... are nothing more than symbols written
alongside the coefficients to enable us to see which is the 0th, which the 1st,
which the 2nd, and so on. Indeed, in some treatments of the topic the symbols
$X$, $X^2$, ... are not used in the definition, and a polynomial is defined to be a
sequence of ring elements $(a_0, a_1, \ldots, a_n, 0, 0, \ldots)$. So a polynomial is nothing
more than its coefficients. Accordingly, to say that two polynomials $p$ and $q$
are equal is to say that for each $i$ the $i$th coefficient of $p$ is equal to the $i$th
coefficient of $q$.


6.1.3 The terms in the expression (*) above may be written in any order,
and if $a_i = 0$ the corresponding term may be omitted. Similarly we may omit
unnecessary coefficients equal to 1 (writing '$X$' instead of '$1X$', and so on).
Thus if
\[ p = 2 + X^3 - 5X \]
then the 0th coefficient of $p$ is 2, the 1st coefficient is $-5$, the 2nd is 0, the
3rd is 1. It is also convenient to say that the 4th, 5th, ... coefficients are zero
(rather than saying that they do not exist). Thus a polynomial always has
an infinite sequence of coefficients, one for each nonnegative integer, but all
the coefficients beyond some point must be zero. »>


6.2 Definition The polynomial all of whose coefficients are zero is called 
the zero polynomial. 

6.3 Definition If $p$ is a polynomial the largest $i$ for which the $i$th coefficient
is nonzero is called the degree of $p$, and this coefficient is called the
leading coefficient of $p$.

So if $p = a_0 + a_1 X + \cdots + a_n X^n$ with $a_n \neq 0$ then $a_n$ is the leading
coefficient and $\deg(p) = n$. Note that we do not define the degree of the zero
polynomial. In some treatments the zero polynomial is said to have degree
$-\infty$. It would not be suitable to define the degree of the zero polynomial to
be zero.

If $p$ is a polynomial in the indeterminate $X$ we often write '$p(X)$' instead
of just '$p$' to remind ourselves that $p$ is a polynomial or to remind ourselves
that the indeterminate is $X$.

Notation. The set of all polynomials over $R$ in the indeterminate $X$ is
denoted by '$R[X]$'.

§6b Addition and multiplication of polynomials

6.4 Definition Let $R$ be a commutative ring with 1 and let $a, b \in R[X]$.
Let $a_i$, $b_i$ be the $i$th coefficients of $a$, $b$ (for $i = 0, 1, 2, \ldots$). Define $a + b$ to be
the polynomial with $i$th coefficient $a_i + b_i$, and define $ab$ to be the polynomial
with $i$th coefficient $a_i b_0 + a_{i-1} b_1 + \cdots + a_0 b_i$ (for $i = 0, 1, 2, \ldots$).

By Definition 6.4, if
\[ a = a_0 + a_1 X + \cdots + a_n X^n \]
\[ b = b_0 + b_1 X + \cdots + b_m X^m \]
then
\[ a + b = (a_0 + b_0) + (a_1 + b_1)X + (a_2 + b_2)X^2 + \cdots \]
\[ ab = a_0 b_0 + (a_1 b_0 + a_0 b_1)X + (a_2 b_0 + a_1 b_1 + a_0 b_2)X^2 + \cdots. \]
Note that the formula for $ab$ is obtained by multiplying out the expressions for
$a$ and $b$ and collecting like terms in the usual way. (In particular, therefore,
the $i$th coefficient is zero for all $i$ sufficiently large.)
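
Definition 6.4 translates directly into operations on coefficient lists. The sketch below assumes integer coefficients and stores a polynomial as a Python list with the 0th coefficient first; the names `poly_add` and `poly_mul` are illustrative.

```python
def poly_add(a, b):
    """The i-th coefficient of a + b is a_i + b_i (missing coefficients count as 0)."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def poly_mul(a, b):
    """The i-th coefficient of ab is the sum of all a_r * b_s with r + s = i."""
    if not a or not b:
        return []
    result = [0] * (len(a) + len(b) - 1)
    for r, a_r in enumerate(a):
        for s, b_s in enumerate(b):
            result[r + s] += a_r * b_s
    return result


p = [2, -5, 0, 1]  # 2 - 5X + X^3, the polynomial of 6.1.3
q = [1, 1]         # 1 + X
print(poly_add(p, q))
print(poly_mul(p, q))
```

Here the product works out to $2 - 3X - 5X^2 + X^3 + X^4$, as multiplying out and collecting like terms confirms.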

6.5 Theorem If R is a commutative ring with 1 and X is an indeterminate 
then R[X] is a commutative ring with 1. 

Proof. We must check the axioms in Definition 2.2 and the commutative
law for multiplication, and find an identity element. It will be convenient to
use the same notation as in the definition above: if $p$ is a polynomial, then
$p_i$ is the $i$th coefficient of $p$.

Let $a, b, c \in R[X]$. Then for all $i$
\[ ((a + b) + c)_i = (a + b)_i + c_i = (a_i + b_i) + c_i
   = a_i + (b_i + c_i) = a_i + (b + c)_i = (a + (b + c))_i \]
and so $(a + b) + c = a + (b + c)$. The proof that $a + b = b + a$ is similar.

The $i$th coefficient of $ab$ is the sum of all terms $a_r b_s$ with $r + s = i$; that
is,
\[ (ab)_i = \sum_{r+s=i} a_r b_s \]
in a convenient notation. We find that
\begin{align*}
((ab)c)_i &= \sum_{r+s=i} (ab)_r c_s \\
          &= \sum_{r+s=i} \Bigl( \sum_{u+v=r} a_u b_v \Bigr) c_s \\
          &= \sum_{u+v+s=i} (a_u b_v) c_s \\
          &= \sum_{u+v+s=i} a_u (b_v c_s) \\
          &= \sum_{u+t=i} a_u \Bigl( \sum_{v+s=t} b_v c_s \Bigr) \\
          &= \sum_{u+t=i} a_u (bc)_t \\
          &= (a(bc))_i
\end{align*}
and therefore $(ab)c = a(bc)$. Similarly
\begin{align*}
(a(b + c))_i &= \sum_{r+s=i} a_r (b + c)_s = \sum_{r+s=i} a_r (b_s + c_s) \\
             &= \sum_{r+s=i} (a_r b_s + a_r c_s) = \sum_{r+s=i} a_r b_s + \sum_{r+s=i} a_r c_s = (ab)_i + (ac)_i,
\end{align*}
so that $a(b + c) = ab + ac$. Similar proofs also apply for the other distributive law and the commutativity
of multiplication.

If we define $z$ to be the polynomial for which $z_i = 0$ for all $i$ then it is
readily checked that $(a + z)_i = a_i = (z + a)_i$ for all $a \in R[X]$, so that $z$ is a
zero element for $R[X]$. It is also easily seen that $-a$ defined by $(-a)_i = -(a_i)$
for all $i$ satisfies $a + (-a) = z = (-a) + a$; so each $a \in R[X]$ has a negative.
Finally, define $e$ to be the polynomial for which the 0th coefficient is the
identity element of $R$ and all the other coefficients are equal to zero. That
is, $e = 1 + 0X + 0X^2 + \cdots$. Then for all $a \in R[X]$,
\[ (ae)_i = a_i e_0 + a_{i-1} e_1 + \cdots + a_0 e_i = a_i \]
since $e_0 = 1$ and $e_j = 0$ for $j \neq 0$. Thus $ae = a$, and since also $ea = a$
(similarly, or by commutativity of multiplication) it follows that $e$ is an identity. □

Comment »>

6.5.1 In accordance with the rules described in 6.1.3 above, the zero
and identity polynomials are usually denoted by '0' and '1', but for the proof
of Theorem 6.5 we needed to distinguish them from the zero and identity of
$R$. See also §6c below. »>


6.6 Theorem (i) Let $R$ be an integral domain and $a$ and $b$ nonzero polynomials
over $R$. Then the leading coefficient of $ab$ is the product of the
leading coefficients of $a$ and $b$, and $\deg(ab) = \deg(a) + \deg(b)$.

(ii) If $R$ is an integral domain then so is $R[X]$.


Proof. (i) Let $n$ be the degree of $a$ and $m$ the degree of $b$. If $r + s > n + m$
then necessarily either $r > n$ or $s > m$, and so either $a_r = 0$ or $b_s = 0$. Hence
if $i > n + m$ then
\[ (ab)_i = \sum_{r+s=i} a_r b_s = 0. \]
Similarly
\[ (ab)_{n+m} = \sum_{r+s=n+m} a_r b_s = a_n b_m \]
since all other terms have either $r > n$ or $s > m$. Since $a_n \neq 0$ and $b_m \neq 0$
and $R$ is an integral domain it follows that $(ab)_{n+m} \neq 0$. Thus $n + m$ is the
largest value of $i$ for which $(ab)_i \neq 0$, and so
\[ \deg(ab) = n + m = \deg(a) + \deg(b). \]
Moreover, the leading coefficient of $ab$ is $(ab)_{n+m}$, and, as we have seen, it is
equal to $a_n b_m$, the product of the leading coefficients of $a$ and $b$.

(ii) We have already proved that $R[X]$ is a commutative ring with 1; so it
remains to prove that it has no zero divisors. But the first part of this proof
shows that if the polynomials $a$ and $b$ each have a nonzero coefficient then
so too does $ab$. That is, if $a \neq 0$ and $b \neq 0$ then $ab \neq 0$, as required. □


§6c Constant polynomials 

Let $R$ be a commutative ring with 1. For each $a \in R$ there is a polynomial
for which the 0th coefficient is $a$ and all the other coefficients are zero. Using
the notation described in 6.1.3 this polynomial would be denoted by '$a$';
our notation does not distinguish between elements of $R$ and these so-called
constant polynomials. However, for the purposes of the next theorem we
need a notation which does distinguish; so, temporarily, we will denote the
constant polynomial $a$ by '$c(a)$'. (That is, $c(a) = a + 0X + 0X^2 + \cdots$.)

6.7 Theorem The set $S = \{\, c(a) \mid a \in R \,\}$ of constant polynomials is
a subring of $R[X]$ isomorphic to $R$, and the function $c\colon R \to S$ defined by
$a \mapsto c(a)$ is an isomorphism.

Proof. As in the proof of 6.5, denote the $i$th coefficient of $p$ by $p_i$. Then
for all $a \in R$ we have $c(a)_0 = a$ and $c(a)_i = 0$ for all $i > 0$.

Let $a, b \in R$ with $c(a) = c(b)$. Then $a = c(a)_0 = c(b)_0 = b$. Thus $c$ is
injective. Since every constant polynomial is of the form $c(a)$ for some $a \in R$,
$c$ is surjective also. Furthermore, $c$ preserves addition and multiplication,
since


\[ c(a + b)_0 = a + b = c(a)_0 + c(b)_0 = (c(a) + c(b))_0 \]
\[ c(ab)_0 = ab = c(a)_0 c(b)_0 = \sum_{r+s=0} c(a)_r c(b)_s = (c(a)c(b))_0 \]
and for $i > 0$
\[ c(a + b)_i = 0 = 0 + 0 = c(a)_i + c(b)_i = (c(a) + c(b))_i \]
\[ c(ab)_i = 0 = \sum_{r+s=i} c(a)_r c(b)_s = (c(a)c(b))_i \]
(since in each term of the sum either $c(a)_r = 0$ or $c(b)_s = 0$).

It remains to check that $S$ is indeed a subring of $R[X]$. Now clearly $S$
is nonempty, since it contains the zero polynomial. If $x$ and $y$ are arbitrary
elements of $S$ then $x = c(a)$, $y = c(b)$ for some $a, b \in R$, and
\[ x + y = c(a) + c(b) = c(a + b) \in S \]
\[ xy = c(a)c(b) = c(ab) \in S \]
\[ -x = -c(a) = c(-a) \in S \]
(the last line following from the fact that the negative of a polynomial is
obtained by taking the negatives of all the coefficients). By Theorem 5.2, $S$
is a subring. □

Comment »>

6.7.1 Since the constant polynomials form a ring isomorphic to $R$ no
harm will come if we identify constant polynomials with the corresponding
elements of $R$. In other words, we regard the ring $S$ of 6.7 as being equal to
$R$, so that $R$ is a subring of $R[X]$. (This fits nicely with our notation which
does not distinguish between constant polynomials and elements of $R$.) Now
$R[X]$ can be thought of as a ring obtained by adjoining to $R$ a new element
$X$ which satisfies no more equations than it is forced to satisfy to make $R[X]$
a commutative ring. »>


§6d Polynomial functions 


Any polynomial $p(X) = a_0 + a_1 X + \cdots + a_n X^n$ in $R[X]$ determines a function
$R \to R$ by the rule
\[ c \mapsto a_0 + a_1 c + \cdots + a_n c^n \]
for all $c \in R$.

Polynomial functions are no doubt very familiar to the reader, but
for us it is important to distinguish between polynomials, sometimes called
polynomial forms, and polynomial functions. Note, for instance, that two
distinct polynomials can give the same function. For example, if $p(X) = X^2$
and $q(X) = X$ in $\mathbb{Z}_2[X]$ then
\[ p(\bar 0) = (\bar 0)^2 = \bar 0 = q(\bar 0) \quad \text{and} \quad p(\bar 1) = (\bar 1)^2 = \bar 1 = q(\bar 1), \]
so that $p(c) = q(c)$ for all $c \in \mathbb{Z}_2$. So the functions $c \mapsto p(c)$ and $c \mapsto q(c)$
are equal. However the polynomials $p$ and $q$ themselves are not equal since
they have different coefficients.
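
The distinction between a polynomial and the function it determines can be made concrete. In this sketch a polynomial over $\mathbb{Z}_2$ is a list of coefficients (0th coefficient first) and classes are represented by the integers 0 and 1; the helper names `evaluate_mod` and `pad` are illustrative.

```python
def evaluate_mod(coeffs, c, modulus):
    """Evaluate a_0 + a_1 c + ... + a_n c^n in Z_modulus."""
    return sum(a * pow(c, i, modulus) for i, a in enumerate(coeffs)) % modulus


def pad(coeffs, length):
    """Append zero coefficients so that two polynomials can be compared termwise."""
    return coeffs + [0] * (length - len(coeffs))


p = [0, 0, 1]  # X^2
q = [0, 1]     # X

# Same values at every element of Z_2 ...
same_function = all(evaluate_mod(p, c, 2) == evaluate_mod(q, c, 2) for c in range(2))
# ... but different coefficient sequences.
n = max(len(p), len(q))
same_polynomial = pad(p, n) == pad(q, n)

print(same_function, same_polynomial)
```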


§6e Evaluation homomorphisms 

If $c$ is any element of $R$ there is a function
\[ e_c\colon R[X] \to R \]
defined by
\[ p(X) \mapsto p(c) \]
for all $p \in R[X]$. In other words, $e_c(p(X)) = p(c)$ for all polynomials $p$.

6.8 Theorem For each $c \in R$ the map $e_c\colon R[X] \to R$ is a homomorphism.

Proof. Let $c \in R$ and let $p, q \in R[X]$. Then, in the notation we have been
using for the $i$th coefficient of a polynomial,
\begin{align*}
e_c(pq) &= (pq)_0 + (pq)_1 c + (pq)_2 c^2 + \cdots \\
        &= p_0 q_0 + (p_0 q_1 + p_1 q_0)c + (p_0 q_2 + p_1 q_1 + p_2 q_0)c^2 + \cdots \\
        &= (p_0 + p_1 c + p_2 c^2 + \cdots)(q_0 + q_1 c + q_2 c^2 + \cdots) \\
        &= e_c(p)e_c(q)
\end{align*}
and similarly
\begin{align*}
e_c(p + q) &= (p + q)_0 + (p + q)_1 c + (p + q)_2 c^2 + \cdots \\
           &= (p_0 + q_0) + (p_1 + q_1)c + (p_2 + q_2)c^2 + \cdots \\
           &= (p_0 + p_1 c + p_2 c^2 + \cdots) + (q_0 + q_1 c + q_2 c^2 + \cdots) \\
           &= e_c(p) + e_c(q).
\end{align*}
□
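
Theorem 6.8 can be illustrated on a concrete pair of polynomials. This is a sketch over the integers with illustrative names (`evaluate`, `poly_add`, `poly_mul`); it checks a single value of c and is not a proof.

```python
from itertools import zip_longest


def evaluate(p, c):
    """Evaluate the coefficient list p (0th coefficient first) at c by Horner's rule."""
    value = 0
    for coefficient in reversed(p):
        value = value * c + coefficient
    return value


def poly_add(a, b):
    """Termwise sum of coefficient lists."""
    return [x + y for x, y in zip_longest(a, b, fillvalue=0)]


def poly_mul(a, b):
    """Product of coefficient lists: coefficient i is the sum of a_r * b_s with r + s = i."""
    if not a or not b:
        return []
    result = [0] * (len(a) + len(b) - 1)
    for r, a_r in enumerate(a):
        for s, b_s in enumerate(b):
            result[r + s] += a_r * b_s
    return result


p, q, c = [2, -5, 0, 1], [1, 1], 3

# Multiply (or add) first and then put X = c, versus put X = c first.
mult_ok = evaluate(poly_mul(p, q), c) == evaluate(p, c) * evaluate(q, c)
add_ok = evaluate(poly_add(p, q), c) == evaluate(p, c) + evaluate(q, c)
print(mult_ok, add_ok)
```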

Comments »> 

6.8.1 The function $e_c$ is called an evaluation homomorphism since it
maps $p(X) \in R[X]$ to $p(X)$ evaluated at $c$ (that is, to $p(c)$).

6.8.2 To say that $e_c$ preserves addition is to say that the result of adding
two polynomials and then putting $X = c$ is the same as first putting $X = c$
in each and then adding. A similar statement applies for multiplication. The
reason it works is that we have defined addition and multiplication of
polynomials to make it work: when adding or multiplying polynomials the
indeterminate $X$ is treated as though it is an element of $R$.

6.8.3 If $R$ is a subring of a ring $S$ then $R[X]$ is a subring of $S[X]$. For
instance, the set of all polynomials with rational coefficients is a subring of
the set of all polynomials with real coefficients. Hence if $c$ is any element
of $S$ the homomorphism $e_c\colon S[X] \to S$ may be restricted to $R[X]$ to yield a
homomorphism from $R[X]$ to $S$. Thus, for instance, the map $\varphi\colon \mathbb{Q}[X] \to \mathbb{R}$
given by $\varphi(p(X)) = p(\sqrt{2})$ is a homomorphism. »>

§6f The division algorithm for polynomials over a field

Our chief application of polynomials in this course will be to study field
extensions. Roughly speaking, if $F$ is a field we wish to be able to make
a larger field by adjoining extra elements to $F$, in much the way that the
complex numbers are obtained from the real numbers by adjoining a square
root of $-1$. We have already seen how $F[X]$ can be regarded as a ring
obtained by adjoining the element $X$ to $F$. However, $F[X]$ is not a field,
and to obtain fields extending $F$ we will have to deal with quotient rings of
$F[X]$, that is, rings obtained from $F[X]$ in the same way as $\mathbb{Z}_n$ is obtained from
$\mathbb{Z}$. Once we have developed the theory of field extensions we will be able
to prove things about the field Con of constructible numbers, which is an
extension field of $\mathbb{Q}$ (rational numbers).

We start by investigating properties of divisibility and factorization for 
polynomials—properties analogous to those properties of Z which were used 
in our construction of Z n . 

6.9 Theorem Let $F$ be a field and $f(X)$, $g(X)$ elements of $F[X]$, with
$g(X) \neq 0$. Then there exist unique $q(X)$ and $r(X)$ in $F[X]$ such that both
the following hold:

(i) $f(X) = q(X)g(X) + r(X)$.

(ii) Either $r(X) = 0$ or $\deg(r(X)) < \deg(g(X))$.

Proof. We first prove the existence of such polynomials $q(X)$ and $r(X)$.

Let $S = \{\, f(X) - k(X)g(X) \mid k(X) \in F[X] \,\}$. If $0 \in S$ then there
is a polynomial $k(X) \in F[X]$ with $f(X) = k(X)g(X)$, and we may take
$q(X) = k(X)$ and $r(X) = 0$. Assume therefore that $0 \notin S$. The set of nonnegative
integers $K = \{\, \deg(p(X)) \mid p(X) \in S \,\}$ is nonempty, and therefore
has a least element $d$. Let $r(X) \in S$ be such that $\deg(r(X)) = d$, and let
$q(X)$ be such that $f(X) - q(X)g(X) = r(X)$ (possible since $r(X) \in S$).

It suffices to prove that $d < \deg(g(X))$; so suppose that this is not true.
Let $\deg(g(X)) = m$ and let the leading coefficients of $r(X)$ and $g(X)$ be $a$
and $b$ respectively. Now $ab^{-1}X^{d-m}g(X)$ has degree $(d - m) + \deg(g) = d$ and
leading coefficient $(ab^{-1})(\text{leading coefficient of } g) = a$; thus it has the same
degree and leading coefficient as $r(X)$. It follows that the $d$th and all higher
coefficients of $s(X) = r(X) - ab^{-1}X^{d-m}g(X)$ are zero. Moreover,
\[ s(X) = f(X) - q(X)g(X) - ab^{-1}X^{d-m}g(X) = f(X) - k(X)g(X) \in S \]
where $k(X) = q(X) + ab^{-1}X^{d-m}$. Thus $s(X)$ is an element of $S$ of smaller
degree than $r(X)$, contradicting the choice of $r(X)$. Thus the assumption
that $d \ge m$ leads to a contradiction, and $d < m$, as required.

We have still to prove the uniqueness of $q$ and $r$; so assume that $q_1$ and
$r_1$ satisfy the same two properties; that is, $f = q_1 g + r_1$ and either $r_1 = 0$ or
$\deg(r_1) < \deg(g)$. Then $q_1 g + r_1 = qg + r$, and so $r_1 - r = (q - q_1)g$. Now if
$q - q_1 \neq 0$ then by Theorem 6.6 (i)
\[ \deg(r_1 - r) = \deg(q - q_1) + \deg(g) \ge \deg(g) \]
which is impossible since the $i$th coefficients of both $r_1$ and $r$ are zero for
$i \ge \deg(g)$. Hence $q_1 = q$, and this gives $r_1 = f - q_1 g = f - qg = r$ also.

□

Comment >t» 

6.9.1 The polynomial $q(X)$ is called the quotient and $r(X)$ the remainder.
They can be calculated by a process called the division algorithm, as
follows. Firstly, if $\deg(f) < \deg(g)$ then the quotient is 0 and the remainder
is equal to $f$. Otherwise we subtract from $f$ a multiple of $g$ with the same
degree and leading coefficient as $f$ (that is, $ab^{-1}X^{n-m}g(X)$ where $a$, $b$ are the
leading coefficients of $f$, $g$ and $n = \deg(f)$, $m = \deg(g)$), thereby reducing the
degree, add the term $ab^{-1}X^{n-m}$ to the quotient, and repeat the process:
\[
\begin{array}{r|l}
  & b_m^{-1} a_n X^{n-m} + \cdots \\ \hline
b_m X^m + b_{m-1} X^{m-1} + \cdots + b_0 & a_n X^n + a_{n-1} X^{n-1} + \cdots + a_0 \\
\text{(subtract)} & a_n X^n + b_m^{-1} b_{m-1} a_n X^{n-1} + \cdots \\ \hline
  & a'_{n-1} X^{n-1} + \cdots \quad (= \text{new } a(X)) \\
  & \text{repeat.}
\end{array}
\]
»>
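
The procedure of 6.9.1 can be sketched in code. The version below assumes the field is $\mathbb{Q}$, modelled with Python's `fractions.Fraction` so that the arithmetic is exact; polynomials are coefficient lists with the 0th coefficient first and no trailing zeros, and the name `poly_divmod` is hypothetical.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and either r zero or deg(r) < deg(g).

    The zero polynomial is the empty list; g must be nonzero.
    """
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    r = [Fraction(x) for x in f]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)            # n - m
        factor = r[-1] / Fraction(g[-1])   # a * b^{-1}
        q[shift] += factor                 # add a b^{-1} X^{n-m} to the quotient
        for i, g_i in enumerate(g):        # subtract a b^{-1} X^{n-m} g(X)
            r[i + shift] -= factor * g_i
        while r and r[-1] == 0:            # the leading term has cancelled
            r.pop()
    return q, r


# (X^2 + 1) divided by (X - 2): quotient X + 2, remainder 5.
quotient, remainder = poly_divmod([1, 0, 1], [-2, 1])
print(quotient, remainder)
```

Each pass through the loop lowers the degree of the running remainder, so the loop ends after at most n - m + 1 passes. Dividing by X - 2 leaves the value of the polynomial at 2, namely 5, as the remainder.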


As a corollary of Theorem 6.9 we obtain the Remainder Theorem, which 
provides a quick method for calculating remainders on dividing by polyno¬ 
mials of degree one. 


6.10 The Remainder Theorem Let c ∈ F and f(X) ∈ F[X], where F
is a field. Then the remainder in the division of f(X) by X − c is f(c).


Proof. By 6.9 we have f(X) = (X − c)q(X) + r, where either r = 0 or
deg(r) < 1. In either case r must be a constant polynomial; that is, an
element of F. Evaluating at c gives

f(c) = e_c(f(X)) = e_c(X − c) e_c(q(X)) + e_c(r) = 0 · q(c) + r = r.

□ 

Notation. If a(X), b(X) ∈ F[X] then ‘a(X) | b(X)’ means ‘there exists
q(X) ∈ F[X] with b(X) = a(X)q(X)’. If this holds we say that a(X) is a
factor of b(X).

By Theorem 6.9, a(X) is a factor of b(X) if and only if the remainder
on dividing b(X) by a(X) is zero.

As an immediate consequence of 6.10 we have the Factor Theorem,
which provides an easy method for determining whether or not a given
polynomial of degree 1 is a factor of f(X).

6.11 The Factor Theorem If f(X) ∈ F[X] then X − c is a factor of
f(X) if and only if f(c) = 0.

Proof. Since (X − c) | f(X) if and only if the remainder on dividing f(X)
by X − c is zero, 6.10 yields that (X − c) | f(X) if and only if f(c) = 0. □
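As a small numerical illustration of 6.10 and 6.11 (a sketch only; the names `synthetic_division` and `evaluate` are hypothetical, and polynomials are assumed to be nonempty coefficient lists with the constant term first), division by X − c can be carried out by Horner's rule, and the remainder it produces agrees with f(c):

```python
from fractions import Fraction

def synthetic_division(f, c):
    """Divide f by X - c; returns (quotient, remainder), the remainder a constant."""
    acc = Fraction(0)
    partial = []
    for coeff in reversed(f):        # Horner's rule, leading coefficient first
        acc = acc * c + coeff
        partial.append(acc)
    remainder = partial.pop()        # the final Horner value equals f(c)
    quotient = list(reversed(partial))
    return quotient, remainder

def evaluate(f, c):
    """Evaluate f at c directly from the coefficients."""
    return sum(Fraction(a) * Fraction(c) ** i for i, a in enumerate(f))

# f(X) = X^3 + X - 1 and c = 2: the remainder on division by X - 2 is f(2).
quotient, remainder = synthetic_division([-1, 1, 0, 1], 2)
print(quotient, remainder, evaluate([-1, 1, 0, 1], 2))
```

When the remainder is zero the Factor Theorem says that X − c is a factor, and `quotient` is then the complementary factor.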


§6g The Euclidean Algorithm 

Throughout this section, F will be a field.

6.12 Definition (i) Two polynomials f(X) and g(X) in F[X] are said to
be associates if f(X) = cg(X) for some nonzero c ∈ F.

(ii) A polynomial f(X) ∈ F[X] is said to be monic if it is nonzero and has
leading coefficient 1.

Comment »>

6.12.1 Obviously for any nonzero polynomial f(X) there is a unique
monic polynomial which is an associate of f(X), namely a^{-1}f(X), where a
is the leading coefficient of f(X). »>

6.13 Proposition Nonzero polynomials f(X) and g(X) in F[X] are
associates if and only if f(X) | g(X) and g(X) | f(X).

Proof. If f and g are associates then for some c ∈ F we have f = cg and
g = c^{-1}f, so that g | f and f | g. Conversely, assume that f | g and g | f. Then
f = q_1 g and g = q_2 f for some q_1, q_2 ∈ F[X], both of which are nonzero since
f and g are. Thus by Theorem 6.6 (i) we have

deg(f) = deg(q_1) + deg(g) ≥ deg(g) = deg(q_2) + deg(f).

Hence deg(q_2) = 0, and therefore q_2 is a nonzero element of F, showing that
f and g are associates. □

6.14 Theorem If a(X) and b(X) are polynomials in F[X] which are not
both zero then there exists a unique monic polynomial d(X) ∈ F[X] such
that both the following conditions are satisfied:

(i) d(X) | a(X) and d(X) | b(X).

(ii) If c(X) | a(X) and c(X) | b(X) then c(X) | d(X).

Moreover, there exist m(X), n(X) ∈ F[X] with

d(X) = m(X)a(X) + n(X)b(X).

Proof. Let a(X) and b(X) be elements of F[X] which are not both zero.
We first prove the existence of a d(X) with the required properties.

Define S to be the set of all nonzero polynomials p(X) in F[X] such
that p(X) = m(X)a(X) + n(X)b(X) for some m(X), n(X) ∈ F[X], and
observe that S ≠ ∅ since it must contain either a(X) or b(X). Hence the set
of nonnegative integers

K = { deg(p(X)) | p(X) ∈ S }

is nonempty, and must therefore contain a least element, k. Let d(X) be an
element of S which is monic and has degree k. (By 6.12.1 we can choose
d(X) to be monic, since associates of elements of S are also in S.)

Since d(X) ∈ S the definition of S yields the existence of m(X) and
n(X) with d(X) = m(X)a(X) + n(X)b(X), and from this it follows that
if c(X) | a(X) and c(X) | b(X) then c(X) | (m(X)a(X) + n(X)b(X)) = d(X).
Thus we have established two of the properties of d(X) and have only to
prove that d(X) | a(X) and d(X) | b(X).

Suppose that d(X) ∤ a(X), and let r(X) be the remainder on division
of a(X) by d(X). Then r(X) ≠ 0, and since r(X) = a(X) − q(X)d(X) for
some q(X), we obtain


r(X) = a(X) − q(X)(m(X)a(X) + n(X)b(X))
     = (1 − q(X)m(X))a(X) − q(X)n(X)b(X)


so that r(X) ∈ S. But this contradicts the definition of k, since the degree
of r(X) is less than deg(d(X)) = k. Thus d(X) | a(X) and, by a similar
argument, d(X) | b(X) also.

It remains to prove uniqueness. So, let d_1 and d_2 be monic polynomials
such that conditions (i) and (ii) are satisfied with d replaced by d_1 and also
with d replaced by d_2. By (i) for d_1 and (ii) for d_2 it follows that d_1 | d_2, and
by (i) for d_2 and (ii) for d_1 it follows that d_2 | d_1. By 6.13 we deduce that d_1
and d_2 are associates of each other, and hence, by 6.12.1, d_1 = d_2. □

Comment »> 

6.14.1 The polynomial d(X) in 6.14 is called the greatest common divisor 
of a(X) and b(X). »> 

As in the case of integers, the greatest common divisor of two
polynomials can be calculated by use of the Euclidean Algorithm (which is
almost exactly the same for polynomials as integers):

Given a, b ∈ F[X] with a ≠ 0, b ≠ 0 and deg(a) ≥ deg(b) (or b = 0, a ≠ 0),

    while b ≠ 0 do
        [a, b] := [b, a − b * (a div b)]
    enddo
    a := a / (leading coefficient of a)
    end


At the end of the process, a is the gcd of the initial two polynomials. 
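The loop above translates almost line for line into code. The following Python sketch is an illustration under stated assumptions rather than part of the original text: it works in Z_p[X] for a prime p (primality is not checked), represents polynomials as coefficient lists with the constant term first, and uses the hypothetical helper names `trim`, `poly_rem` and `poly_gcd`. The three-argument `pow` with exponent −1 (available from Python 3.8) computes inverses modulo p.

```python
def trim(a):
    """Drop zero leading coefficients; the empty list is the zero polynomial."""
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def poly_rem(a, b, p):
    """Remainder on dividing a by the nonzero polynomial b in Z_p[X]."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    inv = pow(b[-1], -1, p)              # inverse of the leading coefficient of b
    while len(a) >= len(b):
        shift, coeff = len(a) - len(b), a[-1] * inv % p
        a = trim([(c - coeff * b[i - shift]) % p if i >= shift else c
                  for i, c in enumerate(a)])
    return a

def poly_gcd(a, b, p):
    """Monic gcd of a and b (not both zero) in Z_p[X]."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_rem(a, b, p)      # [a, b] := [b, a - b*(a div b)]
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]      # divide by the leading coefficient

# In Z_3[X]: gcd(X^2 - 1, X^2 + X) = X + 1.
print(poly_gcd([-1, 0, 1], [0, 1, 1], 3))
```

Only the remainder a − b·(a div b) is needed in the loop, so the quotient is never stored.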

Alternatively, let a_1, a_2 be the initial polynomials, and define a_3, a_4,
... by

a_1 = q_3 a_2 + a_3                 deg(a_3) < deg(a_2)
a_2 = q_4 a_3 + a_4                 deg(a_4) < deg(a_3)
    ...
a_{k-2} = q_k a_{k-1} + a_k         deg(a_k) < deg(a_{k-1})
a_{k-1} = q_{k+1} a_k               (a_{k+1} = 0).

The algorithm must terminate eventually since the remainder on
dividing a_{i-1} by a_i is either zero or a polynomial of degree strictly less than
that of a_i. Since the degree of a nonzero polynomial is always a nonnegative
integer, and it is impossible to have an infinite decreasing sequence of
nonnegative integers, it must eventually happen that we get a remainder of
zero. (For instance, if deg(a_i) = 0 then we will certainly find that a_{i+1} = 0;
a polynomial of degree 0 is always a divisor of any other polynomial.)

As for integers, the set of common divisors of a_{i-1} and a_i remains
unchanged throughout the algorithm, and hence

gcd(a_1, a_2) = gcd(a_2, a_3) = ... = gcd(a_k, a_{k+1}).

But gcd(a_k, a_{k+1}) = gcd(a_k, 0), which is the unique monic associate of a_k.
(The greatest common divisor is always monic, by definition; so, for instance,
gcd(2X + 3, 0) = X + 3/2.) Thus we conclude that the gcd of a_1 and a_2 is
the unique monic associate of the last nonzero remainder obtained in the
Euclidean Algorithm.

- Example -

#1 Find m, n ∈ ℝ[X] such that

(*) m(X)(X^3 + X − 1) + n(X)(X^2 + 4) = 1.

--> By division we find

X^3 + X − 1 = X(X^2 + 4) + (−3X − 1)

X^2 + 4 = (−(1/3)X + 1/9)(−3X − 1) + 37/9.

(The Euclidean Algorithm terminates at the next step, since 37/9 | −3X − 1.)
So we have

37/9 = (X^2 + 4) + ((1/3)X − 1/9)(−3X − 1)

     = (X^2 + 4) + ((1/3)X − 1/9)[(X^3 + X − 1) − X(X^2 + 4)]

     = [1 − X((1/3)X − 1/9)](X^2 + 4) + ((1/3)X − 1/9)(X^3 + X − 1)

     = (−(1/3)X^2 + (1/9)X + 1)(X^2 + 4) + ((1/3)X − 1/9)(X^3 + X − 1).

Thus

1 = (−(3/37)X^2 + (1/37)X + 9/37)(X^2 + 4) + ((3/37)X − 1/37)(X^3 + X − 1).

So if we define

m_0(X) = (3/37)X − 1/37
n_0(X) = −(3/37)X^2 + (1/37)X + 9/37

then m(X) = m_0(X), n(X) = n_0(X) is a solution to (*).

Notice that the solution is not unique. In fact, for any p(X) ∈ ℝ[X] we
can obtain another solution by putting

m(X) = m_0(X) + p(X)(X^2 + 4)
n(X) = n_0(X) − p(X)(X^3 + X − 1).

<--
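The back-substitution in Example #1 can be mechanized by carrying the coefficients m and n along with each remainder. The sketch below is an illustration only, not the author's method: it assumes rational coefficients (`fractions.Fraction`), coefficient lists with the constant term first, and the hypothetical helper names `trim`, `add`, `mul`, `divmod_poly` and `extended_gcd`.

```python
from fractions import Fraction

def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def add(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])

def mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(a, b):
    """Quotient and remainder on dividing a by nonzero b (division algorithm)."""
    q, r = [], trim(a)
    while len(r) >= len(b):
        shift, coeff = len(r) - len(b), r[-1] / b[-1]
        term = [Fraction(0)] * shift + [coeff]
        q = add(q, term)
        r = add(r, mul([-c for c in term], b))
    return q, r

def extended_gcd(a, b):
    """Return (d, m, n) with d = m*a + n*b, d the monic gcd of a and b over Q."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    # Invariant: a = m0*A + n0*B and b = m1*A + n1*B for the original A, B.
    m0, n0, m1, n1 = [Fraction(1)], [], [], [Fraction(1)]
    while b:
        q, r = divmod_poly(a, b)
        neg_q = [-c for c in q]
        a, b = b, r
        m0, m1 = m1, add(m0, mul(neg_q, m1))
        n0, n1 = n1, add(n0, mul(neg_q, n1))
    lead = a[-1]                         # scale so that the gcd is monic
    return [c / lead for c in a], [c / lead for c in m0], [c / lead for c in n0]

# Example #1: A = X^3 + X - 1, B = X^2 + 4.
d, m, n = extended_gcd([-1, 1, 0, 1], [4, 0, 1])
print(d, m, n)
```

For the polynomials of Example #1 this reproduces m_0(X) = (3/37)X − 1/37 and n_0(X) = −(3/37)X^2 + (1/37)X + 9/37.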


§6h Irreducible polynomials 

Let F be a field and let p ∈ F[X]. Then for any nonzero c ∈ F the equation
p(X) = c(c^{-1}p(X)) shows that c is a divisor of p. Similarly all associates of
p are divisors of p. Polynomials which have only these trivial divisors are of
considerable theoretical importance.

6.15 Definition A polynomial p ∈ F[X] is said to be irreducible (or
prime) if deg(p) ≥ 1 and the only divisors of p in F[X] are polynomials
of degree 0 and associates of p.

Comments >» 

6.15.1 If p is irreducible and p(X) = d_1(X)d_2(X) then either d_1 is an
associate of p, in which case deg(d_2) = 0, or deg(d_1) = 0, in which case d_2 is
an associate of p.

6.15.2 If p is reducible (that is, not irreducible) and deg(p) ≥ 1 then p
has a divisor d_1 satisfying

(i) deg(d_1) ≥ 1

(ii) d_1 is not an associate of p.

Since d_1 is a divisor of p we have p(X) = d_1(X)d_2(X) for some d_2, and
(ii) above implies that deg(d_2) ≥ 1. This combined with (i) above and the
equation

deg(p) = deg(d_1) + deg(d_2)

yields that

1 ≤ deg(d_i) ≤ deg(p) − 1

for i = 1 and i = 2.

6.15.3 If deg(p) = 1 then it follows from 6.15.2 above that p is irreducible.
For if p were reducible we could find d_1 and d_2 with p(X) = d_1(X)d_2(X)
and 1 ≤ deg(d_i) ≤ deg(p) − 1 (for i = 1, 2). But this is impossible since
deg(p) − 1 = 0. (The point is that if neither d_1 nor d_2 is a constant polynomial
then deg(p) = deg(d_1) + deg(d_2) ≥ 1 + 1 = 2.)

6.15.4 If deg(p) is 2 or 3 and p is reducible then p has a zero in F. For
it follows from 6.15.2 that p(X) = d_1(X)d_2(X) with

deg(d_1) + deg(d_2) = deg(p) (= 2 or 3)

and

deg(d_i) ≥ 1 for i = 1 and i = 2.

Now if both deg(d_1) ≥ 2 and deg(d_2) ≥ 2 then deg(d_1) + deg(d_2) ≥ 4,
contradiction. So either d_1 or d_2 has degree 1. So p has a factor of the form
aX + b with a, b ∈ F and a ≠ 0. Thus, for some d ∈ F[X],

p(X) = (aX + b)d(X)
     = a[X − (−a^{-1}b)]d(X).

By the Factor Theorem, −a^{-1}b is a zero of p(X).


>» 


§6i Some examples 

It will be convenient in the future to denote the elements of ℤ_n by ‘0’, ‘1’, ‘2’,
... and so on, instead of ‘0̄’, ‘1̄’, ‘2̄’, ..., since it becomes rather clumsy to
have bars over all the coefficients when dealing with polynomials over ℤ_n. Of
course we will have to be careful to remember things like 7 = 2 in ℤ_5, −1 = +1
in ℤ_2, and so on.

#2 In ℤ_3[X] the polynomial p(X) = X^2 − X − 1 is irreducible. For by
6.15.4 above, if X^2 − X − 1 were reducible it would have a zero in ℤ_3. But

p(0) = −1 = 2 ≠ 0
p(1) = −1 = 2 ≠ 0
p(2) = 1 ≠ 0,

and since 0, 1, 2 are the only elements of ℤ_3 we see that p(X) has no zeros
in ℤ_3.
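Because ℤ_p is finite, the check in #2 can simply be run over every element of the field. The sketch below is an illustration only (the function names are hypothetical, coefficients are listed with the constant term first, and p is assumed prime). It relies on 6.15.4 together with the Factor Theorem, so it is valid only for polynomials of degree 2 or 3.

```python
def zeros_mod_p(coeffs, p):
    """Zeros in Z_p of the polynomial with the given coefficients."""
    return [c for c in range(p)
            if sum(a * pow(c, i, p) for i, a in enumerate(coeffs)) % p == 0]

def is_irreducible_low_degree(coeffs, p):
    """Irreducibility test in Z_p[X], valid only for degree 2 or 3."""
    degree = len(coeffs) - 1
    if degree not in (2, 3) or coeffs[-1] % p == 0:
        raise ValueError("test applies only to polynomials of degree 2 or 3")
    return not zeros_mod_p(coeffs, p)    # reducible if and only if there is a zero

# X^2 - X - 1 has no zero in Z_3, so it is irreducible in Z_3[X].
print(zeros_mod_p([-1, -1, 1], 3), is_irreducible_low_degree([-1, -1, 1], 3))
```

For degree four or more the absence of zeros proves nothing, which is why the function refuses such input.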

#3 In ℝ[X] the polynomial X^2 − X − 1 is reducible. Indeed

X^2 − X − 1 = (X − α)(X − β)

where α = (1 + √5)/2 and β = (1 − √5)/2.

#4 In ℝ[X] the polynomial X^2 + 1 is irreducible. For if it were reducible
it would have a factor of degree 1, and hence a zero in ℝ. But a^2 + 1 ≠ 0 for
all a ∈ ℝ.

#5 In ℂ[X] the polynomial X^2 + 1 is reducible. Indeed we have the
factorization X^2 + 1 = (X − i)(X + i).

#6 In ℝ[X] we have X^2 − 3 = (X − √3)(X + √3), and so X^2 − 3 is
reducible. But in ℚ[X] the polynomial X^2 − 3 is irreducible, since it has no
zeros in ℚ. (There is no rational number a such that a^2 − 3 = 0.)

#7 Polynomials of degree four or more may have no zeros and yet be
reducible. For instance, X^4 + X^2 + 1 has no zeros in ℝ but is nevertheless a
reducible element of ℝ[X]. In fact X^4 + X^2 + 1 = (X^2 + X + 1)(X^2 − X + 1).

#8 Irreducibles in ℂ[X]

The “Fundamental Theorem of Algebra” states that every polynomial p in
ℂ[X] of degree at least one has a zero in ℂ. By the Factor Theorem it
follows that p(X) = (X − c)q(X) for some c ∈ ℂ and q(X) ∈ ℂ[X]. So if p
is irreducible the degree of q must be zero, making X − c an associate of p.
It follows that the only irreducible polynomials in ℂ[X] are the polynomials
of degree one.

#9 Irreducibles in ℝ[X]

Suppose that p ∈ ℝ[X], p is irreducible, and deg(p) > 1. Since p has no
factors of degree 1 in ℝ[X] it has no zeros in ℝ. But by the Fundamental
Theorem of Algebra p(X) has a zero a + bi in ℂ. We must have b ≠ 0 since
this zero is not in ℝ. Observe that a + bi is also a zero of the polynomial
X^2 − 2aX + a^2 + b^2 ∈ ℝ[X]. By Theorem 6.9

p(X) = q(X)(X^2 − 2aX + a^2 + b^2) + (r_0 + r_1 X)

for some r_0, r_1 ∈ ℝ. Substituting X = a + bi gives

0 = p(a + bi) = q(a + bi) · 0 + (r_0 + r_1(a + bi)) = (r_0 + r_1 a) + (r_1 b)i.

Equating real and imaginary parts gives r_1 b = 0 and r_0 + r_1 a = 0. Since
b ≠ 0 this gives r_1 = 0, and hence r_0 = 0. Thus X^2 − 2aX + a^2 + b^2 is
a factor of p(X), and since p(X) is irreducible it must be an associate of
X^2 − 2aX + a^2 + b^2. Hence all irreducibles in ℝ[X] are of degree 1 or 2.

#10 In ℚ[X] there are irreducibles of all degrees. In fact Eisenstein’s
Criterion, to be proved in §6k below, shows that X^n − 2 is irreducible in ℚ[X]
for all positive integers n.


§6j Factorization of polynomials 

The proofs of the following facts are very similar to the corresponding proofs
for ℤ, and are omitted.

6.16 Theorem Let a, b, p be polynomials over the field F, and suppose
that p is irreducible and p | ab. Then p | a or p | b.

6.17 Lemma Suppose that p, q_1, q_2, ..., q_s are monic irreducible
polynomials in F[X] and that for some nonzero d ∈ F we have

p(X) | d q_1(X)q_2(X)...q_s(X).

Then p(X) = q_j(X) for some j.

6.18 Unique Factorization Theorem (i) Suppose that f(X) is a
polynomial of degree greater than one with coefficients in the field F, and let c be
the leading coefficient of f. Then there exist monic irreducible polynomials
p_1(X), p_2(X), ..., p_r(X) in F[X] such that

f(X) = c p_1(X)p_2(X)...p_r(X).

(ii) If c p_1(X)p_2(X)...p_r(X) = d q_1(X)q_2(X)...q_s(X) where c, d are nonzero
elements of F and the p_i, q_j are monic irreducible polynomials, then c = d,
r = s, and

p_1 = q_{i_1}, p_2 = q_{i_2}, ..., p_r = q_{i_r}

where i_1, i_2, ..., i_r are the numbers 1, 2, ..., r in some order.

- Examples -

#11 Since ℤ_17 is a field (because 17 is prime) the Unique Factorization
Theorem holds in ℤ_17[X]. So, for instance, X^2 − 6X + 5 = (X − 1)(X − 5),
and this is the unique way of writing X^2 − 6X + 5 as a product of irreducibles.
On the other hand, ℤ_16 is not a field, and in ℤ_16[X] we find that

(X − 1)(X − 5) = X^2 − 6X + 5
               = X^2 + 10X + 21
               = (X + 7)(X + 3).

Unique factorization does not hold in ℤ_16[X].

#12 In ℤ_2[X] there are eight polynomials of degree three. We list them all
and express each as a product of irreducibles:

X^3                 = X·X·X
X^3 + 1             = (X + 1)(X^2 + X + 1)
X^3 + X             = X(X + 1)(X + 1)
X^3 + X + 1         is irreducible
X^3 + X^2           = X·X(X + 1)
X^3 + X^2 + 1       is irreducible
X^3 + X^2 + X       = X(X^2 + X + 1)
X^3 + X^2 + X + 1   = (X + 1)(X + 1)(X + 1).
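The contrast between ℤ_17 and ℤ_16 in #11 can be confirmed by a brute-force search. The following sketch is only an illustration (the function names are hypothetical, and coefficient lists run from the constant term upward): it lists every way of writing a monic quadratic as (X − c)(X − d) with coefficients taken modulo n.

```python
def mul_mod(a, b, n):
    """Product of two polynomials with coefficients reduced modulo n."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % n
    return out

def monic_linear_factorizations(f, n):
    """All pairs (c, d), c <= d, with (X - c)(X - d) = f in Z_n[X]."""
    target = [c % n for c in f]
    return [(c, d) for c in range(n) for d in range(c, n)
            if mul_mod([-c, 1], [-d, 1], n) == target]

f = [5, -6, 1]                                  # X^2 - 6X + 5
print(monic_linear_factorizations(f, 17))       # one pair over the field Z_17
print(monic_linear_factorizations(f, 16))       # two different pairs over Z_16
```

Over ℤ_16 the second pair is (9, 13), which is the factorization (X + 7)(X + 3) of the text, since −9 = 7 and −13 = 3 in ℤ_16.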
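The two irreducible cubics in the list of #12 can be found mechanically: by 6.15.4 and the Factor Theorem, a cubic over a field is irreducible exactly when it has no zero in the field. The sketch below is an illustration only; the function name is hypothetical, p is assumed prime, and coefficients are listed with the constant term first.

```python
from itertools import product

def cubics_without_zeros(p):
    """Monic cubics over Z_p having no zero in Z_p (the irreducible ones)."""
    found = []
    for a0, a1, a2 in product(range(p), repeat=3):
        coeffs = [a0, a1, a2, 1]
        # keep the cubic if its value at every c in Z_p is nonzero
        if all(sum(a * c**i for i, a in enumerate(coeffs)) % p
               for c in range(p)):
            found.append(coeffs)
    return found

# Over Z_2 this finds 1 + X^2 + X^3 and 1 + X + X^3.
print(cubics_without_zeros(2))
```

The same search does not decide irreducibility in degree four or more, as #7 of §6i shows.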


§6k Irreducibility over the rationals 

6.19 Proposition Suppose that f(X), g(X) ∈ ℤ[X] and p ∈ ℤ is a prime
integer which divides all the coefficients of f(X)g(X). Then either p divides
all the coefficients of f(X) or all the coefficients of g(X).

Proof. Let

f(X) = a_0 + a_1 X + ... + a_n X^n
g(X) = b_0 + b_1 X + ... + b_m X^m

and define

φ(X) = ā_0 + ā_1 X + ... + ā_n X^n ∈ ℤ_p[X]
γ(X) = b̄_0 + b̄_1 X + ... + b̄_m X^m ∈ ℤ_p[X]

where we have reverted to the bar notation for elements of ℤ_p to avoid
confusion. Then

f(X)g(X) = a_0 b_0 + (a_0 b_1 + a_1 b_0)X + (a_0 b_2 + a_1 b_1 + a_2 b_0)X^2 + ...

and by a similar calculation

φ(X)γ(X) = ā_0 b̄_0 + (ā_0 b̄_1 + ā_1 b̄_0)X + (ā_0 b̄_2 + ā_1 b̄_1 + ā_2 b̄_0)X^2 + ...
         = \overline{a_0 b_0} + \overline{a_0 b_1 + a_1 b_0} X + \overline{a_0 b_2 + a_1 b_1 + a_2 b_0} X^2 + ...
         = 0

since all the coefficients of f(X)g(X) are divisible by p. But ℤ_p is an integral
domain (by Theorem 4.10) and so ℤ_p[X] is also an integral domain (by
Theorem 6.6), and therefore has no zero divisors. It follows that either φ(X) = 0,
in which case ā_0, ā_1, ... are all zero, and a_0, a_1, ... are all divisible by p, or
γ(X) = 0, in which case all the coefficients of g(X) are divisible by p. □

6.20 Gauss’ Lemma Suppose that a(X) ∈ ℚ[X] is reducible and has all
its coefficients in ℤ. Then a(X) has a nontrivial factorization in ℤ[X].

Proof. Let a(X) = f(X)g(X), where f(X), g(X) ∈ ℚ[X] and both f and
g have degree less than the degree of a. Observe that there exist integers
m and n such that all the coefficients of mf(X) and ng(X) lie in ℤ; for
instance, if f(X) = (r_0/s_0) + (r_1/s_1)X + ... + (r_d/s_d)X^d with the r_i and
s_i in ℤ, then taking m = s_0 s_1 ... s_d would suffice. Hence if k = mn the
following property is satisfied:

(P) There exist polynomials f_1 and g_1 which have integral coefficients
    and are associates of f and g respectively, such that
    ka(X) = f_1(X)g_1(X).

Let K be the set of all positive integers k for which (P) is satisfied.

Since K is nonempty it has a least element, h. It suffices to prove that
h = 1, for then (P) shows that a(X) has a factorization of the required kind.
So, suppose that h > 1. By Theorem 3.7 there exists a prime p which is a
factor of h, and since the coefficients of a(X) are integral it follows that the
coefficients of ha(X) are all divisible by p. Since h ∈ K there exist f_1, g_1 ∈ ℤ[X]
with deg(f_1) = deg(f), deg(g_1) = deg(g) and ha(X) = f_1(X)g_1(X), and by
Proposition 6.19 either (1/p)f_1 has integral coefficients or (1/p)g_1 does. It
follows that (h/p) is in K, since

(h/p)a(X) = ((1/p)f_1(X))g_1(X) = f_1(X)((1/p)g_1(X)).

This contradicts the fact that h is the smallest element in K, proving that
h = 1, as required. □

As a corollary of Theorem 6.20 we obtain an easy way of listing all
rational numbers which can possibly be roots of a given integral polynomial.

6.21 Rational Roots Theorem If a(X) = a_0 + a_1 X + ... + a_d X^d ∈ ℤ[X] then
all zeros of a(X) in ℚ have the form ±(m/n) where m and n are integers
such that m | a_0 and n | a_d.

Proof. If a(X) has a zero in ℚ it has a linear factor m − nX in ℚ[X]. Thus
there is a factorization

(m − nX)(b_0 + b_1 X + ... + b_{d-1} X^{d-1}) = a_0 + a_1 X + ... + a_d X^d

and by Theorem 6.20 we may assume that all the coefficients of both factors
are integral. But since a_0 = m b_0 and a_d = −n b_{d-1} we deduce that m | a_0
and n | a_d. This proves the claim, since the zero corresponding to the linear
factor m − nX is (m/n). (The ± appears in the theorem statement merely
to emphasize that m and n may be negative.) □

- Example -

#13 By Theorem 6.21 the only rational numbers which can be zeros of
3 − 13X − 7X^2 + 2X^3 are ±1, ±3, ±(1/2), ±(3/2). Trying them all one finds
that in fact −(3/2) is the only rational zero.
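The search in #13 is finite and easily automated. The following sketch is an illustration under stated assumptions (the function names are hypothetical; the polynomial has integer coefficients listed with the constant term first, and both a_0 and a_d are nonzero): it forms every candidate ±(m/n) with m | a_0 and n | a_d and keeps those which really are zeros.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of the nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_zeros(coeffs):
    """Rational zeros of an integer polynomial with a_0 and a_d nonzero."""
    a0, ad = coeffs[0], coeffs[-1]
    candidates = {sign * Fraction(m, n)
                  for m in divisors(a0) for n in divisors(ad)
                  for sign in (1, -1)}
    return sorted(c for c in candidates
                  if sum(a * c**i for i, a in enumerate(coeffs)) == 0)

# 3 - 13X - 7X^2 + 2X^3 has exactly one rational zero.
print(rational_zeros([3, -13, -7, 2]))
```

Exact `Fraction` arithmetic is used so that the test for a zero is not affected by rounding.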


6.22 Eisenstein’s Irreducibility Criterion If p is a prime integer and
a(X) = a_0 + a_1 X + ... + a_d X^d ∈ ℤ[X] is such that p | a_i for all i ≠ d, p ∤ a_d
and p^2 ∤ a_0, then a(X) is irreducible in ℚ[X].

Proof. Suppose that there is a nontrivial factorization of a(X) in ℚ[X]:

(*) a_0 + a_1 X + ... + a_d X^d = (b_0 + b_1 X + ... + b_r X^r)(c_0 + c_1 X + ... + c_s X^s)

where r ≥ 1, s ≥ 1 and r + s = d. By 6.20 we may assume that b_i, c_i ∈ ℤ.
On expanding (*) we obtain the equations

a_0 = b_0 c_0
a_1 = b_0 c_1 + b_1 c_0
a_2 = b_0 c_2 + b_1 c_1 + b_2 c_0
    ...

and in ℤ_p we therefore have similar equations

ā_0 = b̄_0 c̄_0
ā_1 = b̄_0 c̄_1 + b̄_1 c̄_0
ā_2 = b̄_0 c̄_2 + b̄_1 c̄_1 + b̄_2 c̄_0
    ...

which in ℤ_p[X] yield the factorization

(**) ā_0 + ā_1 X + ... + ā_d X^d = (b̄_0 + b̄_1 X + ... + b̄_r X^r)(c̄_0 + c̄_1 X + ... + c̄_s X^s).

But ā_0 = ā_1 = ... = ā_{d-1} = 0 and ā_d ≠ 0; so the left hand side above
is an associate of X^d. But ℤ_p is a field, and so the Unique Factorization
Theorem 6.18 applies to ℤ_p[X]. The only monic irreducible factor of X^d
is X; so it follows that the factors in (**) are associates of powers of X.
Moreover, the right hand side of (**) cannot have degree less than d = r + s;
so b̄_r ≠ 0 and c̄_s ≠ 0. Thus

b̄_0 + b̄_1 X + ... + b̄_r X^r = λX^r
c̄_0 + c̄_1 X + ... + c̄_s X^s = μX^s

for some λ, μ ∈ ℤ_p. Since r ≥ 1 and s ≥ 1 it follows that b̄_0 = 0 and c̄_0 = 0.
Thus for some integers h and k we have b_0 = ph and c_0 = pk, giving

a_0 = b_0 c_0 = p^2 hk

contrary to the assumption that p^2 ∤ a_0. □

For example, application of Eisenstein’s Criterion with p = 3 shows
that X^4 − 9X^2 − 36X − 33 is irreducible over ℚ, but gives no information
about X^4 − 9X^2 − 36X − 36.
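The hypotheses of 6.22 are three divisibility conditions, so they can be checked directly. The sketch below is an illustration only: the function name is hypothetical, coefficients a_0, ..., a_d are listed with the constant term first, the degree is assumed to be at least one, and the primality of p is not verified.

```python
def eisenstein_applies(coeffs, p):
    """True if the hypotheses of Eisenstein's Criterion hold for the prime p.

    A False result gives no information about irreducibility.
    """
    *lower, lead = coeffs
    return (all(a % p == 0 for a in lower)      # p | a_i for all i != d
            and lead % p != 0                   # p does not divide a_d
            and coeffs[0] % (p * p) != 0)       # p^2 does not divide a_0

print(eisenstein_applies([-33, -36, -9, 0, 1], 3))   # X^4 - 9X^2 - 36X - 33
print(eisenstein_applies([-36, -36, -9, 0, 1], 3))   # X^4 - 9X^2 - 36X - 36
```

The first call returns True and the second False, matching the two polynomials discussed above; in the second case p^2 = 9 divides the constant term 36.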


- Examples -

#14 Prove that ∛2 is irrational.

--> By Eisenstein’s Criterion with p = 2 we see that X^3 − 2 is irreducible
in ℚ[X] and so, by the Factor Theorem, it has no zeros in ℚ. Hence ∛2 ∉ ℚ.

Alternatively, by the Rational Roots Theorem the only possible rational
roots of X^3 − 2 are ±1 and ±2. None of these is a root, whereas ∛2 is a
root; so ∛2 is not rational. <--

#15 Let a, b, c be rational numbers which are not all zero, and let

t = a + b∛2 + c(∛2)^2.

Prove that t ≠ 0.

--> If c and b are both zero then a cannot be zero, and t = a ≠ 0. If c = 0
and b ≠ 0 then t = 0 would give ∛2 = −(a/b), contradicting the irrationality
of ∛2. Thus we may assume that c ≠ 0.

Let p(X) = a + bX + cX^2, and let r(X) be the remainder obtained
when X^3 − 2 is divided by p(X). Thus

($) X^3 − 2 = p(X)q(X) + r(X)

for some q(X) ∈ ℚ[X], and by Theorem 6.9 r(X) = e + dX for some e, d ∈ ℚ.
If t = 0 then substituting X = ∛2 in ($) gives r(∛2) = 0, and this contradicts
the irrationality of ∛2 unless e = d = 0. Hence X^3 − 2 is divisible by p(X).
But this contradicts the fact that X^3 − 2 is irreducible in ℚ[X], proved in
#14 above. Hence t ≠ 0. <--


Exercises 

1. Find the greatest common divisor d(X) of the polynomials

f(X) = X^3 − 6X^2 + X + 4

and

g(X) = X^5 − 6X + 1

in ℚ[X], as well as polynomials s(X) and t(X) such that

d(X) = s(X)f(X) + t(X)g(X).

2. Find the greatest common divisor d(X) of

f(X) = X^5 + X^4 + 2X^3 − X^2 − X − 2

and

g(X) = 2X^4 + 4X^3 + 3X + 3

where f(X), g(X) ∈ ℤ_5[X]. Also find polynomials s(X), t(X) ∈ ℤ_5[X]
such that

d(X) = s(X)f(X) + t(X)g(X).

3. Write X^4 + 2X^3 + X^2 + 2X + 2 over ℤ_3 as a product of irreducible
polynomials.

4. Let f(X) = a_0 + a_1 X + ... + a_{n-1} X^{n-1} + X^n ∈ ℚ[X] be a monic
polynomial with integral coefficients. Show that if
ā_0 + ā_1 X + ... + X^n ∈ ℤ_p[X] is irreducible (for some prime p),
then f(X) is irreducible over ℚ.

5. List all the polynomials of degree 4 in ℤ_2[X], and express them as
products of irreducibles.

6. Test the following polynomials for irreducibility over ℚ:

(i) 5X^3 + X^2 + X − 4

(ii) 3X^5 + 2X^3 − 6X + 6

(iii) X^4 + 7X^3 − 10X^2 + 2X + 5 (Hint: Apply Exercise 4 with p = 2.)

(iv) 2X^4 + 7X^3 − 15X^2 + 3X + 6

7. Give an example of a quadratic polynomial in Zq[X] which has more 
than two roots. 

8 . Prove that \/6 is irrational. 

9. Find all monic irreducible quadratic polynomials over ℤ_5.

10. Let R be a commutative ring with 1. Prove that the polynomial ring
R[X] cannot be a field.

(Hint: Show that X does not have an inverse.)

11. Let f(X) and g(X) be monic polynomials of degree 5 in ℤ_7[X] with the
property that f(c) = g(c) for all c ∈ ℤ_7. Show that f(X) = g(X).

12. Let F be a field and f(X), g(X) and h(X) distinct monic polynomials
in F[X] with f(X) and g(X) irreducible. Show that the gcd of h(X)
and f(X)g(X) is 1 if h(X) is irreducible, but need not be 1 otherwise.

13. Prove that … → Mat(3, ℝ) defined by

        ( a_0  a_1  a_2 )
    … ↦ (  0   a_0  a_1 )
        (  0    0   a_0 )

is a homomorphism.



7 

More Ring Theory 


In this chapter we return to the general setting and develop the remaining 
theory which will be needed in this course. Our principal objectives are the 
definition of quotient rings and the Fundamental Homomorphism Theorem. 


§7a More on homomorphisms 

It will be convenient for us to temporarily generalize slightly the concept of
a homomorphism. Suppose that S and T are sets each equipped with
operations called addition and multiplication, about which nothing is assumed.
(Thus, in particular, S and T need not be rings.) We still refer to a map
θ: S → T satisfying θ(xy) = θ(x)θ(y) and θ(x + y) = θ(x) + θ(y) as a
homomorphism.

7.1 Lemma Let R be a ring and T any set equipped with addition and
multiplication, and suppose that θ: R → T is a homomorphism. Then

S = { θ(x) | x ∈ R }

is a subset of T which forms a ring under the operations of T, and

K = { x ∈ R | θ(x) = θ(0) }

is an ideal of R.

Proof. Let a, b, c ∈ S. Then there exist x, y, z ∈ R such that a = θ(x),
b = θ(y) and c = θ(z), and the associativity of addition in R together with
the fact that θ preserves addition gives

(a + b) + c = (θ(x) + θ(y)) + θ(z) = θ(x + y) + θ(z) = θ((x + y) + z)
            = θ(x + (y + z)) = θ(x) + θ(y + z) = θ(x) + (θ(y) + θ(z)) = a + (b + c).

Similar arguments show that a + b = b + a, (ab)c = a(bc), a(b + c) = ab + ac
and (a + b)c = ac + bc. Hence S satisfies Axioms (i), (iv), (v) and (vi) of
Definition 2.2.

The element θ(0) ∈ S is a zero element for S, since if a ∈ S is arbitrary
then (for some x ∈ R)

a = θ(x) = θ(x + 0) = θ(x) + θ(0) = a + θ(0)

and

a = θ(x) = θ(0 + x) = θ(0) + θ(x) = θ(0) + a.

Furthermore, since

a + θ(−x) = θ(x) + θ(−x) = θ(x + (−x)) = θ(0)

and

θ(−x) + a = θ(−x) + θ(x) = θ((−x) + x) = θ(0)

we see that a has a negative. Thus Axioms (ii) and (iii) are satisfied.

To prove that K is an ideal we use Proposition 5.8. Since 0 ∈ K we
have that K ≠ ∅. Now if x, y ∈ K and r ∈ R then

θ(x + y) = θ(x) + θ(y) = θ(0) + θ(0) = θ(0 + 0) = θ(0)
θ(−x) = θ(−x + 0) = θ(−x) + θ(0) = θ(−x) + θ(x) = θ(−x + x) = θ(0)
θ(xr) = θ(x)θ(r) = θ(0)θ(r) = θ(0r) = θ(0)
θ(rx) = θ(r)θ(x) = θ(r)θ(0) = θ(r0) = θ(0)

so that x + y, −x, rx and xr are all in K. So all the required closure properties
hold. □

Comment t»t> 

7.1.1 Since the element θ(0) of S is the zero of S the definition of K is,
effectively,

K = { x ∈ R | θ(x) = 0 }.

The set K is usually called the ‘kernel’ of θ. (The set S is called the ‘image’
of θ; see §0b.) »>


7.2 Definition If R and S are rings and θ: R → S a homomorphism, then
the kernel of θ is the subset of R

ker θ = θ^{-1}(0_S) = { x ∈ R | θ(x) = 0_S }

where 0_S is the zero of S.


From Lemma 7.1 we have immediately:

7.3 Theorem The kernel of a ring homomorphism θ: R → S is an ideal of
R, the image a subring of S.

It is trivial that a homomorphism θ: R → S is surjective if and only if
im θ = S. The next result provides an analogous criterion for injectivity:

7.4 Theorem A ring homomorphism θ: R → S is injective if and only if
ker θ = {0_R}.

Proof. Suppose that θ is injective. It follows from Theorem 5.5 (i) that
0_R ∈ ker θ. Suppose that x is another element of ker θ. Then

θ(x) = 0_S = θ(0_R)

and injectivity of θ gives x = 0_R. Thus 0_R is the only element of ker θ, as
required.

Conversely, suppose that ker θ = {0_R}, and let x and y be elements of
R with θ(x) = θ(y). Then by Theorem 5.5

θ(x − y) = θ(x) − θ(y) = 0_S

and therefore x − y ∈ ker θ. Hence x − y = 0_R, and x = y. Thus θ is
injective. □

7.5 Theorem If θ: R → S and ψ: S → T are ring homomorphisms, then
the composite map ψθ: R → T defined by

(ψθ)(x) = ψ(θ(x)) for all x ∈ R

is also a homomorphism.

The proof of this is left to the exercises.

- Example -

#1 Define f: ℤ[X] → ℤ by

f(a_0 + a_1 X + ... + a_n X^n) = a_0.

Observe that f coincides with the evaluation map e_0 (see §6e): the result of
putting X = 0 in a_0 + a_1 X + ... + a_n X^n is a_0. So f is a homomorphism.
Now let g: ℤ → ℤ_3 be the natural homomorphism (given by a ↦ ā for all
a ∈ ℤ (see §5b#7)). The composite map gf: ℤ[X] → ℤ_3 is given by

a_0 + a_1 X + ... + a_n X^n ↦ ā_0

and by 7.5 it is a homomorphism. By 7.3 the kernel of this homomorphism
is an ideal of ℤ[X]. It can be seen that this ideal consists of all polynomials
over ℤ with constant term divisible by three.


§7b More on ideals 

If R is any ring then the subsets {0} and R are ideals. For some rings these
are the only ideals; for instance, fields have no ideals other than these trivial
ones.

Let I be an ideal of R, and suppose that a ∈ I. Then I contains every
element of the form ra for r ∈ R. In particular, if R has an identity and a
has an inverse then I contains t = (ta^{-1})a for any t ∈ R. This observation
gives us the following theorem:

7.6 Theorem (i) An ideal which contains an element with an inverse must
be the whole ring.

(ii) If F is a field then the only ideals in F are {0} and F.

Proof. The first part is immediate from the preceding remarks, and the
second part follows from the first since all nonzero elements of fields have
inverses. □

Notation. If R is a commutative ring the set { ar | r ∈ R } will be denoted
by ‘aR’ or ‘Ra’.

7.7 Theorem If R is a commutative ring and a ∈ R then aR is an ideal
of R.

Proof. Since a0 ∈ aR it is immediate that aR ≠ ∅. Now let x, y ∈ aR and
r ∈ R. Then x = as, y = at for some s, t ∈ R, and hence

x + y = as + at = a(s + t) ∈ aR
−x = −(as) = a(−s) ∈ aR
xr = (as)r = a(sr) ∈ aR.

Since R is commutative we deduce also that rx = xr ∈ aR. By
Proposition 5.8 it follows that aR is an ideal. □

Comment »>

7.7.1 The above proof uses commutativity of R, and it is impossible to
avoid this. If R is not commutative then aR is not necessarily an ideal.

»>


7.8 Definition Let R be a commutative ring with 1 and let a ∈ R. Then
aR is called the principal ideal generated by a.

- Examples -

#2 Let φ: ℤ → ℤ_n be the homomorphism given by φ(a) = ā (see §5b#7).
The kernel of φ must be an ideal of ℤ (by 7.3), and in fact

ker φ = { a ∈ ℤ | ā = 0 }
      = { a ∈ ℤ | a ≡ 0 (mod n) }
      = { a ∈ ℤ | a = nk for some k ∈ ℤ }
      = nℤ.

That is, ker φ is the principal ideal generated by n.

#3 Define σ: ℝ[X] → ℝ by σ(f(X)) = f(3) for all f ∈ ℝ[X]. (That is, in
the notation of §6e, σ = e_3.) Then

ker σ = { f(X) | f(3) = 0 }
      = { f(X) | X − 3 is a factor of f(X) }     (by Theorem 6.11)
      = { (X − 3)g(X) | g ∈ ℝ[X] }
      = (X − 3)ℝ[X].

That is, ker σ is the principal ideal generated by X − 3.

#4 There are ideals which are not principal. We saw in #1 that the subset
I of ℤ[X] consisting of those polynomials which have constant term divisible
by three is an ideal of ℤ[X]. However, I is not a principal ideal: there is no
p ∈ ℤ[X] such that I = p(X)ℤ[X]. For suppose that such a p exists. Then
since 3 ∈ I we have 3 = p(X)q(X) for some q ∈ ℤ[X]. By Theorem 6.6

deg(p) + deg(q) = deg(3) = 0

and so deg(p) = deg(q) = 0. So p and q are constant polynomials; that
is, p and q are elements of ℤ (regarding ℤ as a subring of ℤ[X]; see 6.7.1).
Since 3 is the product of p and q and since 3 is prime we see that the only
possibilities for p are ±1 and ±3. But if p = ±1 then pℤ[X] = ℤ[X] ≠ I,
and if p = ±3 then pℤ[X] = 3ℤ[X] consists of those polynomials for which
every coefficient is divisible by three, not just the 0th. So our assumption
that pℤ[X] = I is contradicted, showing that no such p can exist.

#5 There is a homomorphism ρ: ℝ[X] → ℂ given by

ρ(p(X)) = p(i) for all p ∈ ℝ[X].

By Theorem 6.9, for any p ∈ ℝ[X] there exists q ∈ ℝ[X] and a, b ∈ ℝ with

p(X) = q(X)(X^2 + 1) + bX + a,

and putting X = i we get p(i) = bi + a. So if p(i) = 0 we must have both
a = 0 and b = 0, in which case X^2 + 1 is a factor of p(X). Thus the kernel
of ρ is the set of all p which have X^2 + 1 as a factor:

ker ρ = (X^2 + 1)ℝ[X].

§7c Congruence modulo an ideal 

If n is a positive integer then, as we have seen, the set nTL of all integers 
divisible by n is an ideal of Z, and on Z there is an equivalence relation 
‘congruence modulo n’ given by 

a = b (mod n ) if and only if a — b E nTL. 

The same works for any ideal in any ring. 

7.9 Theorem Let I be an ideal in a ring R, and define a relation on R by 
a = b (mod I) if and only if a — b E I. 

The relation so obtained is an equivalence relation. 

Proof. For all x E R we have x — x — 0 El, since I is a subring and must 
therefore contain the zero of R (by 5.2.1). Hence x = x (mod /), and the 
Reflexive Law is satisfied. 
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Suppose that x, y E I and x = y. Then x — y G /, and so 

y — x = —(x — y) G I 

by Proposition 5.8. Thus y = x, and the Symmetric Law is satisfied. 

Finally, suppose that x, y, z G R and x = y and y = z. Then x — y G I 
and y — z G /, and so 

x — z — (x — y) + (y — z) G / 

by Proposition 5.8. Thus x = z. and the Transitive Law holds. □ 

The relation defined in 7.9 is called congruence modulo $I$. By the results
of §4a we see that it partitions $R$ into equivalence classes. These equivalence
classes are called the cosets of $I$. Reformulating this slightly gives the
following:

7.10 Definition If $I$ is an ideal in the ring $R$ and $a \in R$ then the coset of
$I$ containing $a$ is the set $I + a = \{\, b \mid b \in R \text{ and } a - b \in I \,\}$.

Comment ▷▷▷

7.10.1 The notation '$I + a$' derives from the fact that the coset containing
$a$ is alternatively described as the set $\{\, x + a \mid x \in I \,\}$. ▷▷▷

Occasionally we will use the same notation as we used in §4a for
equivalence classes, and write '$\overline{a}$' for '$I + a$'. The advantage of the bar notation
is that it is shorter, the disadvantage that it suppresses any mention of $I$, so
that the reader has to remember which ideal is being used.
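
The partition of a ring into cosets can be checked by brute force in a small case. The following Python sketch is an illustration only and makes its own assumptions: it takes $R = \mathbb{Z}$ and $I = 6\mathbb{Z}$, it looks only at the integers from $-12$ to $12$ (a finite window chosen for convenience), and the names `same_coset` and `coset` are invented here.

```python
def same_coset(a, b, n):
    """a is congruent to b modulo nZ exactly when a - b lies in nZ."""
    return (a - b) % n == 0

def coset(a, n, window):
    """The part of the coset nZ + a that lies inside the finite range `window`."""
    return {b for b in window if same_coset(a, b, n)}

n = 6
window = range(-12, 13)          # assumption: a finite slice of Z

# Collect the distinct cosets met inside the window.
classes = {frozenset(coset(a, n, window)) for a in window}

# They are pairwise disjoint and cover the window: a partition.
covers = set().union(*classes) == set(window)
disjoint = sum(len(c) for c in classes) == len(window)
```

Within the window there are exactly six classes, and `coset(1, n, window)` agrees with the description $\{\, x + 1 \mid x \in 6\mathbb{Z} \,\}$ of 7.10.1 restricted to the same range.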


§7d Quotient rings 

If $I$ is an ideal in the ring $R$ define

$$R/I = \{\, I + a \mid a \in R \,\}.$$

In other words, $R/I$ is the set of all equivalence classes of $R$ under the relation
of congruence modulo $I$. That is, in accordance with Definition 4.3, $R/I$ is
the quotient of $R$ by this equivalence relation. We wish to define operations
of addition and multiplication on $R/I$ to make $R/I$ into a ring. We do this
in exactly the same way as we did it for $\mathbb{Z}_n$.

7.11 Theorem Let $I$ be an ideal in the ring $R$. Then there exist well-defined
operations of addition and multiplication on $R/I$ such that

$$(I + a) + (I + b) = I + (a + b)$$

$$(I + a)(I + b) = I + ab$$

for all $a, b \in R$.

Proof. Since every element of $R/I$ has the form $I + a$ for some $a \in R$,
the given equations define the sum and product of every pair of elements
of $R/I$. The problem is that since elements of $R/I$ may be expressible in
this form in several ways the equations may be inconsistent. We must prove,
therefore, that if $I + a' = I + a$ and $I + b' = I + b$ then $I + a'b' = I + ab$ and
$I + (a' + b') = I + (a + b)$.

Assume that $I + a' = I + a$ and $I + b' = I + b$. Then $a \equiv a'$ and $b \equiv b'$
(mod $I$), and so $a' = a + x$, $b' = b + y$ for some $x, y \in I$. Now

$$a' + b' = (a + x) + (b + y) = (a + b) + (x + y) \equiv a + b \pmod{I}$$

$$a'b' = (a + x)(b + y) = ab + (xb + ay + xy) \equiv ab \pmod{I}$$

since by 5.8 the fact that $x, y \in I$ gives that $x + y$, $xb$, $ay$, $xy$ and hence
$xb + ay + xy$ are all in $I$. Thus $I + (a' + b') = I + (a + b)$ and $I + a'b' = I + ab$,
as required. □

Comment ▷▷▷

7.11.1 By these definitions, to add or multiply two cosets one picks
elements in the cosets and adds or multiplies the elements. The theorem shows
that the coset containing the result is independent of the elements chosen.

▷▷▷
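
The independence of the chosen representatives can be tested exhaustively in a small case. This is a sketch under assumptions of our own: the ring is $\mathbb{Z}$, the ideal is $6\mathbb{Z}$, representatives are drawn from the arbitrary range $-12, \ldots, 11$, and `cls` is a made-up helper that labels a coset by its least non-negative element.

```python
from itertools import product

n = 6                          # the ideal is I = 6Z
reps = range(-2 * n, 2 * n)    # a finite supply of representatives (arbitrary choice)

def cls(a):
    """Label the coset nZ + a by its least non-negative element."""
    return a % n

# For every pair of cosets and every choice of representatives a2, b2 of them,
# the coset of the sum and of the product must not depend on the choice.
consistent = all(
    cls(a2 + b2) == cls(a + b) and cls(a2 * b2) == cls(a * b)
    for a, b in product(range(n), repeat=2)
    for a2 in reps if cls(a2) == cls(a)
    for b2 in reps if cls(b2) == cls(b)
)
```

The flag `consistent` comes out true, as Theorem 7.11 predicts; a finite check of this kind is of course no substitute for the proof.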


7.12 Theorem Let $I$ be an ideal in the ring $R$. Then $R/I$ is a ring under
the operations of addition and multiplication defined in 7.11. Furthermore,
the map $\nu\colon R \to R/I$ defined by $\nu(a) = I + a$ (for all $a \in R$) is a surjective
homomorphism.

Proof. The definitions of addition and multiplication yield immediately
that

$$\nu(a)\nu(b) = (I + a)(I + b) = I + ab = \nu(ab)$$

$$\nu(a) + \nu(b) = (I + a) + (I + b) = I + (a + b) = \nu(a + b)$$

and therefore $\nu$ is a homomorphism. It is trivial that $\nu$ is surjective, since
every element of $R/I$ has the form $I + a$ with $a \in R$. Since by Lemma 7.1
the image of $\nu$ is a ring we deduce also that $R/I$ is a ring. □


7.13 Definition The map $\nu\colon R \to R/I$ defined in Theorem 7.12 is called
the natural homomorphism from $R$ to the quotient ring $R/I$.

- Examples - 

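#6 If $R = \mathbb{Z}$ and $I = n\mathbb{Z}$ (where $n \in \mathbb{Z}^+$) then the cosets $n\mathbb{Z} + a$ ($a \in \mathbb{Z}$)
are exactly the congruence classes $\overline{a}$ ($a \in \mathbb{Z}$) as defined in §4b, and the
quotient ring $\mathbb{Z}/n\mathbb{Z}$ is exactly the same as the ring $\mathbb{Z}_n$.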

#7 Let $R = \mathbb{R}[X]$ and $I = (X^2 + 1)\mathbb{R}[X]$. As we have seen (#5 above),
for each $p \in \mathbb{R}[X]$ there exist $q \in \mathbb{R}[X]$ and $a, b \in \mathbb{R}$ with

$$p(X) = (X^2 + 1)q(X) + bX + a.$$

This gives

$$p(X) \equiv bX + a \pmod{I}$$

and shows that every equivalence class contains a polynomial of the form
$bX + a$. Thus

$$R/I = \{\, I + (bX + a) \mid a, b \in \mathbb{R} \,\}.$$

Observe that

$$\begin{aligned}
(I + bX + a)(I + dX + c) &= I + (bX + a)(dX + c) \\
&= I + (bdX^2 + (ad + bc)X + ac) \\
&= I + bd(X^2 + 1) + (ad + bc)X + (ac - bd).
\end{aligned}$$

But since

$$bd(X^2 + 1) + (ad + bc)X + (ac - bd)$$

is congruent to

$$(ad + bc)X + (ac - bd)$$

modulo the ideal $I = (X^2 + 1)\mathbb{R}[X]$, this gives

$$(I + (a + bX))(I + (c + dX)) = I + ((ac - bd) + (ad + bc)X). \tag{$\clubsuit$}$$

We also have

$$(I + (a + bX)) + (I + (c + dX)) = I + ((a + c) + (b + d)X). \tag{$\heartsuit$}$$

Comparing ($\clubsuit$) and ($\heartsuit$) with the rules for multiplication and addition of
complex numbers,

$$(a + bi)(c + di) = (ac - bd) + (ad + bc)i$$

$$(a + bi) + (c + di) = (a + c) + (b + d)i$$

we see readily that the ring $R/I = \mathbb{R}[X]/(X^2 + 1)\mathbb{R}[X]$ is isomorphic to $\mathbb{C}$.
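
The comparison with complex multiplication can be replayed numerically. In the sketch below, which is only an illustration, a coset $I + (a + bX)$ is stored as the pair `(a, b)`; this representation and the function names are our own assumptions, not notation from the text.

```python
def mul_mod_x2_plus_1(p, q):
    """Multiply the cosets I + (a + bX) and I + (c + dX), where I = (X^2 + 1)R[X].

    Each coset is stored as the pair (a, b) of its representative a + bX.
    """
    a, b = p
    c, d = q
    # (a + bX)(c + dX) = ac + (ad + bc)X + bd X^2
    const, lin, quad = a * c, a * d + b * c, b * d
    # X^2 = (X^2 + 1) - 1 is congruent to -1 modulo I
    return (const - quad, lin)

def to_complex(p):
    """The candidate isomorphism I + (a + bX) -> a + bi."""
    a, b = p
    return complex(a, b)

p, q = (2, 3), (-1, 4)
prod = mul_mod_x2_plus_1(p, q)          # (-14, 5)
agrees = to_complex(prod) == to_complex(p) * to_complex(q)
```

Here `prod` is the pair `(-14, 5)`, matching $(2 + 3i)(-1 + 4i) = -14 + 5i$.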


From an intuitive point of view the construction of a quotient ring
$R/I$ amounts to regarding all elements of $I$ as being equal to zero. As a
consequence of this we must regard two elements as equal if they differ by an
element of $I$; that is, if they are in the same coset. This process is sometimes
called "factoring out $I$".


§7e The Fundamental Homomorphism Theorem


7.14 The Fundamental Homomorphism Theorem Let $R$ and $S$ be
rings and let $\theta\colon R \to S$ be a homomorphism. Then

$$R/\ker\theta \cong \operatorname{im}\theta.$$

Indeed there is an isomorphism

$$\psi\colon R/\ker\theta \to \operatorname{im}\theta$$

satisfying

$$\psi(\ker\theta + a) = \theta(a)$$

for all $a \in R$.

Proof. Since every element of $R/\ker\theta$ is expressible in the form $\ker\theta + a$,
possibly in more than one way, the formula $\psi(\ker\theta + a) = \theta(a)$ will
define a function from $R/\ker\theta$ to $S$ provided that it is consistent with itself.



106 Chapter Seven: More Ring Theory 


Thus we must show that if $\ker\theta + a' = \ker\theta + a$ then $\theta(a') = \theta(a)$. But if
$\ker\theta + a' = \ker\theta + a$ then $a' - a \in \ker\theta$, and hence, by 5.5,

$$\theta(a') - \theta(a) = \theta(a' - a) = 0$$

giving the result. So $\psi$ is well defined.

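If $\alpha, \beta \in R/\ker\theta$ then there exist $a, b \in R$ such that $\alpha = \ker\theta + a$ and
$\beta = \ker\theta + b$, and we obtain

$$\begin{aligned}
\psi(\alpha)\psi(\beta) &= \psi(\ker\theta + a)\,\psi(\ker\theta + b) \\
&= \theta(a)\theta(b) \\
&= \theta(ab) \quad\text{(since $\theta$ preserves multiplication)} \\
&= \psi(\ker\theta + ab) \\
&= \psi((\ker\theta + a)(\ker\theta + b)) \\
&= \psi(\alpha\beta).
\end{aligned}$$

A similar argument based on the fact that $\theta$ preserves addition shows that
$\psi(\alpha) + \psi(\beta) = \psi(\alpha + \beta)$. Thus $\psi$ is a homomorphism, and it remains to show
that $\psi$ is bijective.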

Let $x \in \operatorname{im}\theta$. Then $x = \theta(a)$ for some $a \in R$, and from this it follows
that $\psi(\ker\theta + a) = \theta(a) = x$. Hence $\psi$ is a surjective mapping from $R/\ker\theta$ to
$\operatorname{im}\theta$. Now suppose that $\psi(\alpha) = \psi(\beta)$ for some $\alpha$ and $\beta$ in $R/\ker\theta$. Choosing
$a$ and $b$ in $R$ with $\alpha = \ker\theta + a$ and $\beta = \ker\theta + b$ we find that

$$\theta(a - b) = \theta(a) - \theta(b) = \psi(\alpha) - \psi(\beta) = 0$$

and therefore $a - b \in \ker\theta$. That is, $a \equiv b \pmod{\ker\theta}$, and

$$\alpha = \ker\theta + a = \ker\theta + b = \beta.$$

Therefore $\psi$ is injective. □

Comment ▷▷▷

7.14.1 If $\theta\colon R \to S$ is a homomorphism with $\ker\theta = I$ then $\theta$ maps two
elements of $R$ to the same element of $\operatorname{im}\theta \subseteq S$ if and only if the two given
elements of $R$ differ by an element of $I$. Since factoring out $I$ amounts to
regarding two elements of $R$ as equal if and only if they differ by an element
of $I$, this means that each element of $\operatorname{im}\theta$ corresponds to just one element of
$R/I$. So the homomorphism $R \to \operatorname{im}\theta$ becomes an isomorphism $R/I \to \operatorname{im}\theta$.

▷▷▷
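
A finite example makes the correspondence in 7.14.1 concrete. The sketch below is an illustration chosen by us, not taken from the text: $\theta\colon \mathbb{Z}_{12} \to \mathbb{Z}_4$ is reduction modulo 4 (a ring homomorphism because 4 divides 12), elements of $\mathbb{Z}_{12}$ are the integers 0 to 11, and the names `theta`, `coset` and `psi` are hypothetical.

```python
m, n = 12, 4
R = range(m)                      # the ring Z_12

def theta(a):
    """Reduction modulo 4, viewed as a map Z_12 -> Z_4."""
    return a % n

# theta preserves addition and multiplication.
is_hom = all(
    theta((a + b) % m) == (theta(a) + theta(b)) % n
    and theta((a * b) % m) == (theta(a) * theta(b)) % n
    for a in R for b in R
)

kernel = frozenset(a for a in R if theta(a) == 0)

def coset(a):
    """The coset ker(theta) + a inside Z_12."""
    return frozenset((k + a) % m for k in kernel)

quotient = {coset(a) for a in R}

# psi(ker theta + a) = theta(a): record every value theta takes on each coset.
psi = {c: {theta(a) for a in c} for c in quotient}
well_defined = all(len(values) == 1 for values in psi.values())
image = {theta(a) for a in R}
```

Each coset of the kernel is sent to a single value, and distinct cosets go to distinct values, so `psi` is a bijection from the four cosets onto the image.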
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- Examples - 

#8 For any ring $R$ the identity map $\iota\colon R \to R$ (given by $\iota(x) = x$ for
all $x \in R$) is a homomorphism. Clearly $\ker\iota = \{0\}$ and $\operatorname{im}\iota = R$, and
so the Fundamental Homomorphism Theorem says that $R/\{0\} \cong R$. The
isomorphism guaranteed by 7.14 is $\{0\} + a \mapsto \iota(a) = a$.

#9 For any rings $R$ and $S$ the zero map $R \to S$, defined by $x \mapsto 0_S$ (for
all $x \in R$), is a homomorphism. Its kernel is the whole of $R$ and its image is
the zero subring of $S$. By 7.14,

$$R/R \cong \{0\}.$$

(Note that $R/R$ has just one element, since $R + x = R$ for all $x \in R$.)

#10 Let $\varphi\colon \mathbb{R}[X] \to \mathbb{C}$ be the homomorphism considered in #5 above,
namely

$$\varphi(p(X)) = p(i) \quad\text{for all } p \in \mathbb{R}[X].$$

Clearly $\varphi$ is surjective (since every element of $\mathbb{C}$ is of the form $a + bi$ for
some $a, b \in \mathbb{R}$, and $a + bi = \varphi(a + bX)$). So $\operatorname{im}\varphi = \mathbb{C}$. As we saw in #5,
$\ker\varphi = (X^2 + 1)\mathbb{R}[X]$. Hence 7.14 gives

$$\mathbb{R}[X]/(X^2 + 1)\mathbb{R}[X] \cong \mathbb{C}.$$

Furthermore there is an isomorphism

$$\psi\colon \mathbb{R}[X]/(X^2 + 1)\mathbb{R}[X] \to \mathbb{C}$$

satisfying

$$\psi(I + p(X)) = \varphi(p(X)) = p(i)$$

for all $p \in \mathbb{R}[X]$ (where we have written '$I$' for '$(X^2 + 1)\mathbb{R}[X]$'). In particular
this says

$$\psi(I + (a + bX)) = a + bi,$$

in agreement with #7 above.

#11 Let $R$ and $S$ be rings. Recall that the direct sum $R \dotplus S$ of $R$ and
$S$ is the set $\{\, (r, s) \mid r \in R,\ s \in S \,\}$ under componentwise addition and
multiplication (see §2c#5).

There is a homomorphism $\eta\colon R \to R \dotplus S$ given by $\eta(r) = (r, 0)$ for all
$r \in R$. Since $\ker\eta = \{0\}$ it follows that $\operatorname{im}\eta \cong R/\{0\} \cong R$ (by #8 above).
The image of $\eta$ is the set $R' = \{\, (r, 0) \mid r \in R \,\}$; we have thus shown that $R'$
is a subring of $R \dotplus S$ isomorphic to $R$. Similarly the set $S' = \{\, (0, s) \mid s \in S \,\}$
is a subring of $R \dotplus S$ isomorphic to $S$.

Now define $\pi\colon R \dotplus S \to R$ by $\pi(r, s) = r$. It is easily seen that $\pi$ is a
homomorphism and that $\operatorname{im}\pi = R$ and $\ker\pi = \{\, (0, s) \mid s \in S \,\} = S'$. So in
fact $S'$ is an ideal of $R \dotplus S$, and $(R \dotplus S)/S' \cong R$. Similarly, $R'$ is an ideal and
$(R \dotplus S)/R' \cong S$.

The isomorphism $\psi\colon (R \dotplus S)/S' \to R$ given by 7.14 can be described
explicitly in the following way. If $(r, s) \in R \dotplus S$ then the coset $S' + (r, s)$ is

$$\begin{aligned}
&\{\, (x, y) \mid x \in R,\ y \in S \text{ and } (x, y) - (r, s) \in S' \,\} \\
&\quad= \{\, (x, y) \mid x \in R,\ y \in S \text{ and } (x - r, y - s) \in S' \,\} \\
&\quad= \{\, (x, y) \mid x \in R,\ y \in S \text{ and } x - r = 0 \,\} \\
&\quad= \{\, (r, y) \mid y \in S \,\}.
\end{aligned}$$

By 7.14 we have $\psi(S' + (r, s)) = \pi(r, s)$; that is,

$$\psi(\{\, (r, y) \mid y \in S \,\}) = r.$$

In other words, what we have shown is this:

For each $r \in R$ there is a coset of $S'$ consisting of all ordered
pairs $(x, y)$ in $R \dotplus S$ such that the first component, $x$, is equal
to $r$. This gives a one-to-one correspondence between elements
of $R$ and cosets of $S'$; that is, between elements of $R$ and
elements of $(R \dotplus S)/S'$. This correspondence is an isomorphism.


Exercises 

1. (i) Prove that $\mathbb{Q}[\sqrt[3]{2}]$ as defined in Exercise 11 of Chapter Five is a
subfield of $\mathbb{R}$.

(Hint: By Exercise 11 of Chapter Five, $\mathbb{Q}[\sqrt[3]{2}]$ is an integral
domain. To prove that all nonzero elements of $\mathbb{Q}[\sqrt[3]{2}]$ have
inverses in $\mathbb{Q}[\sqrt[3]{2}]$ use §6k#15 of Chapter 6 and prove the
formula

$$\bigl(a + b\sqrt[3]{2} + c(\sqrt[3]{2})^2\bigr)^{-1} = (e/d) + (f/d)\sqrt[3]{2} + (g/d)(\sqrt[3]{2})^2$$

where

$$d = a^3 + 2b^3 + 4c^3 - 6abc$$

$$e = a^2 - 2bc, \qquad f = 2c^2 - ab, \qquad g = b^2 - ac$$

for inverses of elements of $\mathbb{Q}[\sqrt[3]{2}]$.)

(ii) Prove that $\mathbb{Q}[\sqrt[3]{2}]$ is isomorphic to the quotient ring $\mathbb{Q}[X]/K$,
where $K$ is the set of all $p(X) \in \mathbb{Q}[X]$ such that $p(\sqrt[3]{2}) = 0$.

(Hint: Use the Fundamental Homomorphism Theorem and
the evaluation homomorphism $p(X) \mapsto p(\sqrt[3]{2})$.)

2. Calculate the kernel and image of the homomorphism $\psi$ in Exercise 13
of Chapter Six.

3. Prove Theorem 7.5.

4. Let $R$ be a commutative ring with 1. For $x, y \in R$ let '$x \mid y$' mean 'there
exists $z \in R$ with $y = xz$'. Prove that if $a, b \in R$ then $aR = bR$ if and
only if $a \mid b$ and $b \mid a$.

5. Let $\theta$ be the homomorphism defined in Exercise 13 of Chapter Five:



where the domain $R$ of $\theta$ is a subring of $\operatorname{Mat}(3, \mathbb{Z})$ and the codomain is
$\operatorname{Mat}(2, \mathbb{Z})$. Prove that the kernel of $\theta$ is equal to the set $I$ of all matrices of



Deduce that $R/I \cong \operatorname{Mat}(2, \mathbb{Z})$, and give an explicit isomorphism.

(Hint: $R/I$ is the set of equivalence classes of the relation $\sim$ defined
in Exercise 13 of Chapter Five.)




6. In each case show that $I$ is an ideal of the given ring $R$, and find a
homomorphism which has kernel equal to $I$:

(i) $R = \mathbb{Z}$, $I = 5\mathbb{Z}$


K= {(t c) 


a, 6, c G 


I = 


a 0 
b 0 


a, b G 


(iii) $R = \mathbb{Z}[X]$

$I = \{\, a_0 + a_1X + \cdots + a_nX^n \mid a_i \in \mathbb{Z},\ a_0 = 0 \,\}$

(iv) $R = \mathbb{Z}[X]$

$I = \{\, a_0 + a_1X + \cdots + a_nX^n \mid a_i \in \mathbb{Z},\ a_i \equiv 0 \pmod{2} \,\}$

(v) $R = \mathbb{Z}[X]$

/ = { a 0 + aiXH-b a n X n | G Z, = 0 (mod 3) }. 


7. Let $I$ be an ideal and $S$ a subring in the ring $R$.

(i) Show that $S + I = \{\, s + x \mid s \in S,\ x \in I \,\}$ is a subring of $R$.

(ii) Show that $I$ is an ideal in $S + I$.

(iii) Show that $\theta\colon S \to (S + I)/I$ given by $\theta(s) = I + s$ is a
homomorphism.

(iv) Deduce that $S \cap I$ is an ideal in $S$ and $S/(S \cap I)$ is isomorphic to
$(S + I)/I$. (Hint: Use 7.14 and 7.3.)

(v) Prove directly that $S \cap I$ is an ideal in $S$.


8. Prove that if $A = n\mathbb{Z}$ and $B = m\mathbb{Z}$ then $A + B = d\mathbb{Z}$ and $A \cap B = l\mathbb{Z}$,
where $d = \gcd(n, m)$ and $l = \operatorname{lcm}(n, m)$ and $A + B$ is as defined in the
previous exercise.




8 

Field Extensions 


In §7d#7 we saw that the polynomial ring $\mathbb{R}[X]$ has a quotient ring
isomorphic to the field $\mathbb{C}$ of complex numbers. This is an example of a phenomenon
we wish to study in more detail, as a method of constructing fields containing
a given field $F$ as a subfield. The fields to be constructed will be quotient
rings of $F[X]$. Our first step in this program is to study ideals in $F[X]$.


§8a Ideals in polynomial rings 

From now on we will only be concerned with polynomials over fields. 

8.1 Theorem Let $F$ be a field and let $I$ be an ideal of $F[X]$. Then there
exists a polynomial $f(X)$ such that $I = f(X)F[X]$.

Proof. By 5.2.1 we know that $0 \in I$. If 0 is the only element of $I$ then the
assertion of the theorem holds with $f(X) = 0$. Thus we may assume that $I$
contains nonzero elements.

Of all nonzero elements of $I$ choose $f(X)$ to be one of minimal degree,†
and let $J = f(X)F[X]$. If $p(X) \in J$ then $p(X) = q(X)f(X)$ for some $q$,
and, since $f(X) \in I$, 5.8 (iv) yields that $p(X) \in I$. Thus $J \subseteq I$. Conversely,
let $p(X) \in I$, and let $r(X)$ be the remainder on dividing $p(X)$ by $f(X)$.
Then for some polynomial $q$ we have $r(X) = p(X) - q(X)f(X)$, and since
$p(X)$ and $f(X)$ are both in $I$ we deduce (by 5.8) that $r(X) \in I$. By the
choice of $f(X)$ we know therefore that the degree of $r(X)$ cannot be less
than the degree of $f(X)$; hence by Theorem 6.9 it follows that $r(X) = 0$.
Thus $p(X) = f(X)q(X) \in f(X)F[X] = J$, and we conclude that $I \subseteq J$.
Hence $I = J$, as required. □


† Note the use of the Least Integer Principle in this step.

Comment ▷▷▷

8.1.1 This says that all ideals of $F[X]$ are principal. Furthermore, in the
above proof we have in fact shown that a nonzero ideal in $F[X]$ is generated
by any nonzero element of minimal degree contained in it. ▷▷▷
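
Theorem 8.1 can be tried out on an example. The sketch below is an illustration only and rests on assumptions of our own: $F = \mathbb{Q}$ (via Python's `fractions.Fraction`), polynomials stored as coefficient lists with the constant term first, and helper names `trim`, `poly_divmod` and `monic_gcd` that do not come from the text. It uses the Euclidean algorithm, which is a different route from the minimal-degree argument of the proof, to find a single generator $d$ of the ideal $pF[X] + gF[X]$: every remainder produced along the way is a combination of $p$ and $g$, so $d$ lies in the ideal, and $d$ divides both $p$ and $g$.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients of highest degree (constant term comes first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(p, f):
    """Return (q, r) with p = q*f + r and deg r < deg f; f must be nonzero."""
    r = trim([Fraction(c) for c in p])
    f = trim([Fraction(c) for c in f])
    q = [Fraction(0)] * max(len(r) - len(f) + 1, 0)
    while len(r) >= len(f):
        shift = len(r) - len(f)
        coeff = r[-1] / f[-1]
        q[shift] = coeff
        for i, c in enumerate(f):
            r[i + shift] -= coeff * c
        r = trim(r)
    return trim(q), r

def monic_gcd(p, g):
    """Euclidean algorithm; p and g must not both be zero."""
    p = trim([Fraction(c) for c in p])
    g = trim([Fraction(c) for c in g])
    while g:
        p, g = g, poly_divmod(p, g)[1]
    return [c / p[-1] for c in p]

p = [-1, 0, 0, 1]        # X^3 - 1
g = [-1, 0, 1]           # X^2 - 1
d = monic_gcd(p, g)      # coefficients of X - 1
```

Dividing `p` and `g` by `d` leaves remainder zero in both cases, consistent with the ideal they generate being $(X - 1)\mathbb{Q}[X]$.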

As a corollary of 8.1 we obtain the following proposition:

8.2 Proposition Let $I$ be an ideal in $F[X]$ with $I \neq F[X]$, and suppose
that $I$ contains an irreducible polynomial $p(X)$. Then $I = p(X)F[X]$.

Proof. By Theorem 8.1 there exists $f(X) \in F[X]$ with $I = f(X)F[X]$.
Since $p(X) \in I$ it follows that $f(X) \mid p(X)$. Since $p(X)$ is irreducible $f(X)$
must be either an associate of $p(X)$ or of degree zero. But if $\deg(f) = 0$
then by 7.6 (i) we obtain $I = F[X]$, contrary to hypothesis. So $f$ and $p$ are
associates, and therefore $f(X)F[X] = p(X)F[X]$ (by Exercise 4 of Chapter
Seven). □


§8b Quotient rings of polynomial rings 

Continuing with the notation of 8.1, let $I = f(X)F[X]$. We wish to
investigate the ring $Q = F[X]/I$. For simplicity we will use the bar notation for
cosets: $\overline{g(X)} = I + g(X)$ for all $g \in F[X]$.

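8.3 Theorem Suppose that $f(X) = c_0 + c_1X + \cdots + c_nX^n$, where $n \geq 1$,
$c_i \in F$ for each $i$, and $c_n \neq 0$. Then we have the following:

(i) Each element of $Q = F[X]/I$ is uniquely expressible in the form

$$\overline{a_0} + \overline{a_1}\,\overline{X} + \cdots + \overline{a_{n-1}}\,\overline{X}^{n-1}$$

with $a_0, a_1, \ldots, a_{n-1} \in F$.

(ii) The set $\overline{F} = \{\, \overline{a} \mid a \in F \,\}$ is a subring of $F[X]/I$ isomorphic to $F$.

(iii) The element $\overline{X}$ of $Q$ satisfies the equation

$$\overline{c_0} + \overline{c_1}\,\overline{X} + \cdots + \overline{c_n}\,\overline{X}^n = \overline{0}.$$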


Proof. (i) An arbitrary element of $Q$ is a coset of $I$, and hence equal
to $\overline{g(X)}$ for some polynomial $g \in F[X]$. By Theorem 6.9

$$g(X) = q(X)f(X) + (a_0 + a_1X + \cdots + a_{n-1}X^{n-1}) \tag{$*$}$$

for uniquely determined $a_0, a_1, \ldots, a_{n-1} \in F$. Since $I$ is the set of all
polynomials of the form $q(X)f(X)$ it follows that equation ($*$) is equivalent to
$g(X) \equiv a_0 + a_1X + \cdots + a_{n-1}X^{n-1} \pmod{I}$, and hence to

$$\overline{g(X)} = \overline{a_0} + \overline{a_1}\,\overline{X} + \cdots + \overline{a_{n-1}}\,\overline{X}^{n-1}. \tag{$**$}$$

So ($**$) holds for unique $a_i$, as required.

(ii) Define a mapping $\theta\colon F \to F[X]/I$ by $\theta(a) = \overline{a}$. Then $\theta$ is a
homomorphism, since it is the restriction to the subring $F$ of $F[X]$ of the natural
homomorphism $F[X] \to Q$. (See 7.12 and 5.5.2.) If $a \in F$ is in $\ker\theta$ then
$\overline{a} = \overline{0}$, and since $a \in F$ it follows from (i) that $a = 0$. (Alternatively,
$\overline{a} = \overline{0}$ means that $a \in I$, and hence $a$ is divisible by $f(X)$. Since $a$ is a
constant and $\deg(f) \geq 1$ we must have $a = 0$.) Thus $\ker\theta = \{0\}$, and since
$\operatorname{im}\theta = \{\, \theta(a) \mid a \in F \,\} = \overline{F}$, Theorem 7.14 gives

$$\overline{F} \cong F/\ker\theta \cong F$$

in view of §7e#8.

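(iii) By the definition of addition and multiplication in a quotient ring,

$$\overline{r(X)} + \overline{s(X)} = \overline{r(X) + s(X)}$$

$$\overline{r(X)}\;\overline{s(X)} = \overline{r(X)s(X)}$$

for all $r(X), s(X) \in F[X]$. Hence

$$\overline{c_0} + \overline{c_1}\,\overline{X} + \cdots + \overline{c_n}\,\overline{X}^n = \overline{c_0 + c_1X + \cdots + c_nX^n} = \overline{f(X)}$$

which is equal to $\overline{0}$ since $f(X) \in I$. □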

Comment ▷▷▷

8.3.1 Part (ii) of 8.3 permits us to regard $F$ as a subring of $Q$, in the
same way as we have identified $F$ with the set of constant polynomials in
$F[X]$. That is, we identify $\overline{a}$ with $a$ for each $a \in F$. Thus $Q$ is a ring
which contains $F$ as a subring and also contains an element $\overline{X}$ satisfying
$c_0 + c_1\overline{X} + \cdots + c_n\overline{X}^n = 0$. That is, $\overline{X} \in Q$ is a zero of the polynomial
$f(Y) = c_0 + c_1Y + \cdots + c_nY^n \in Q[Y]$. Furthermore, by (i) of 8.3, every
element of $Q$ is of the form $a_0 + a_1\overline{X} + \cdots + a_{n-1}\overline{X}^{n-1}$ with coefficients
$a_i \in F$. Hence $Q = F[X]/f(X)F[X]$ can be regarded as a ring obtained
from $F$ by adjoining to $F$ a new element $\overline{X}$ which is to be a zero of $f$.

These remarks should be compared with the remarks in 6.7.1. The
polynomial ring $F[X]$ is a ring obtained from $F$ by adjoining an element $X$
which satisfies no nontrivial equations. Now $Q$ is obtained by adjoining to $F$
an element $\overline{X}$ satisfying the equation $c_0 + c_1\overline{X} + \cdots + c_n\overline{X}^n = 0$. Furthermore
we know by 7.12 that there is a homomorphism

$$F[X] \to Q = F[X]/I$$

such that $X \mapsto \overline{X}$, and, in general,

$$a_0 + a_1X + \cdots + a_rX^r \mapsto a_0 + a_1\overline{X} + \cdots + a_r\overline{X}^r$$

for any polynomial $a_0 + a_1X + \cdots + a_rX^r \in F[X]$. This is reasonable, since
factoring $I$ out of $F[X]$ amounts to regarding elements of $I$, in particular the
element $f(X) = c_0 + c_1X + \cdots + c_nX^n$, as being equal to 0. ▷▷▷
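
The description of $Q$ in 8.3 translates directly into a computation: a coset is represented by the remainder of any of its members on division by $f(X)$. The sketch below is an illustration under our own assumptions: $F = \mathbb{Q}$, $f(X) = X^3 - 2$ (chosen only as a sample), coefficient lists with the constant term first, and hypothetical helper names.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients of highest degree (constant term comes first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1) if p and q else []
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_rem(g, f):
    """Remainder of g on division by the nonzero polynomial f (Theorem 6.9)."""
    r = trim([Fraction(c) for c in g])
    f = trim([Fraction(c) for c in f])
    while len(r) >= len(f):
        shift = len(r) - len(f)
        coeff = r[-1] / f[-1]
        for i, c in enumerate(f):
            r[i + shift] -= coeff * c
        r = trim(r)
    return r

f = [-2, 0, 0, 1]                 # f(X) = X^3 - 2, so n = 3

def coset(g):
    """Coefficients a_0, a_1, ... of the unique representative of I + g of degree < n."""
    return poly_rem(g, f)

def coset_mul(g, h):
    """Product of the cosets I + g and I + h, again in reduced form."""
    return coset(poly_mul(g, h))

x_bar = coset([0, 1])                               # the element X-bar
x_bar_cubed = coset_mul(coset_mul(x_bar, x_bar), x_bar)
reduced = coset([1, 1, 1, 1, 1, 1])                 # 1 + X + ... + X^5
```

In this example `x_bar_cubed` is the constant 2, so $\overline{X}$ is indeed a zero of $f$, and `reduced` has only three coefficients, as part (i) of 8.3 requires.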

To complete our discussion of quotient rings of $F[X]$ it remains to say
what happens when $p(X)$ is a constant polynomial. There are two cases,
both trivial.

8.4 Theorem Let $p(X) = a \in F$, and let $I = p(X)F[X]$.

(i) If $a = 0$ then $I = \{0\}$ and $F[X]/I \cong F[X]$.

(ii) If $a \neq 0$ then $I = F[X]$ and $F[X]/I \cong \{0\}$.

Proof. Part (i) is immediate from §7e#8, and, in view of 7.6 (i), Part (ii)
is immediate from §7e#9. □


- Examples - 

#1 Let $R = \mathbb{Q}[X]/(X^2 - 3)\mathbb{Q}[X]$. We use the bar notation again: if
$g(X) \in \mathbb{Q}[X]$ then $\overline{g(X)}$ denotes the element $(X^2 - 3)\mathbb{Q}[X] + g(X)$ of the
ring $R$. By the discussion in 8.3.1 we know that $R$ can be thought of as the
result of adjoining to $\mathbb{Q}$ an element $\overline{X}$ satisfying $\overline{X}^2 - 3 = 0$. Every element
of $R$ will have the form $a + b\overline{X}$ for some $a, b \in \mathbb{Q}$, and the following rules
hold for addition and multiplication in $R$:

$$\begin{aligned}
(a + b\overline{X}) + (c + d\overline{X}) &= (a + c) + (b + d)\overline{X} \\
(a + b\overline{X})(c + d\overline{X}) &= ac + (ad + bc)\overline{X} + bd\overline{X}^2 \\
&= (ac + 3bd) + (ad + bc)\overline{X}
\end{aligned}$$

since $\overline{X}^2 = 3$.

There is another way to adjoin to $\mathbb{Q}$ an element whose square is three;
namely, consider the set $\mathbb{Q}[\sqrt{3}]$ of all real numbers of the form $a + b\sqrt{3}$
with $a$ and $b$ in $\mathbb{Q}$. We saw in §5a#6 that $\mathbb{Q}[\sqrt{3}]$ is a subfield of $\mathbb{R}$. The
above considerations suggest that this subfield of $\mathbb{R}$ ought to be isomorphic
to $\mathbb{Q}[X]/(X^2 - 3)\mathbb{Q}[X]$. It is easy to prove this by using the Fundamental
Homomorphism Theorem.

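Define $\theta\colon \mathbb{Q}[X] \to \mathbb{R}$ by $\theta(g(X)) = g(\sqrt{3})$ for all $g \in \mathbb{Q}[X]$. Then $\theta$ is
a homomorphism. Since $(\sqrt{3})^i$ is in $\mathbb{Q}$ if $i$ is even and of the form $q\sqrt{3}$ with
$q \in \mathbb{Q}$ if $i$ is odd, we have

$$\begin{aligned}
\operatorname{im}\theta &= \{\, a_0 + a_1\sqrt{3} + a_2(\sqrt{3})^2 + \cdots + a_n(\sqrt{3})^n \mid 0 \leq n \in \mathbb{Z},\ a_i \in \mathbb{Q} \,\} \\
&= \{\, a + b\sqrt{3} \mid a, b \in \mathbb{Q} \,\} \\
&= \mathbb{Q}[\sqrt{3}].
\end{aligned}$$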


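Moreover, since (by 6.9) any element of $\mathbb{Q}[X]$ is expressible in the form
$(X^2 - 3)q(X) + (a + bX)$ with $q(X) \in \mathbb{Q}[X]$ and $a, b \in \mathbb{Q}$, it follows that

$$\begin{aligned}
\ker\theta &= \{\, g \in \mathbb{Q}[X] \mid g(\sqrt{3}) = 0 \,\} \\
&= \{\, (X^2 - 3)q(X) + a + bX \mid q \in \mathbb{Q}[X],\ a, b \in \mathbb{Q},\ a + b\sqrt{3} = 0 \,\} \\
&= \{\, (X^2 - 3)q(X) \mid q \in \mathbb{Q}[X] \,\} \\
&= (X^2 - 3)\mathbb{Q}[X].
\end{aligned}$$

(The third equality holds because $\sqrt{3}$ is irrational, so that $a + b\sqrt{3} = 0$
with $a, b \in \mathbb{Q}$ forces $a = b = 0$.) By 7.14,

$$\mathbb{Q}[X]/(X^2 - 3)\mathbb{Q}[X] \cong \mathbb{Q}[\sqrt{3}]$$

and there is an isomorphism satisfying

$$a + b\overline{X} = (X^2 - 3)\mathbb{Q}[X] + (a + bX) \mapsto a + b\sqrt{3}$$

for all $a, b \in \mathbb{Q}$.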

#2 Prove that $\mathbb{Q}[\sqrt[3]{2}] = \{\, a + b\sqrt[3]{2} + c(\sqrt[3]{2})^2 \mid a, b, c \in \mathbb{Q} \,\}$ is isomorphic
to $\mathbb{Q}[X]/(X^3 - 2)\mathbb{Q}[X]$.

▷ Define $\phi\colon \mathbb{Q}[X] \to \mathbb{R}$ by $\phi(p(X)) = p(\sqrt[3]{2})$. By Theorem 6.8 we know
that $\phi$ is a homomorphism. Now any rational linear combination of powers
of $\sqrt[3]{2}$ lies in $\mathbb{Q}[\sqrt[3]{2}]$, since $(\sqrt[3]{2})^i$ is in $\mathbb{Q}$ if $i \equiv 0 \pmod{3}$, of the form $q\sqrt[3]{2}$
with $q \in \mathbb{Q}$ if $i \equiv 1 \pmod{3}$ and of the form $q(\sqrt[3]{2})^2$ with $q \in \mathbb{Q}$ if
$i \equiv 2 \pmod{3}$. Thus

$$\begin{aligned}
\operatorname{im}\phi &= \{\, a_0 + a_1\sqrt[3]{2} + \cdots + a_n(\sqrt[3]{2})^n \mid 0 \leq n \in \mathbb{Z},\ a_i \in \mathbb{Q} \,\} \\
&= \{\, a + b\sqrt[3]{2} + c(\sqrt[3]{2})^2 \mid a, b, c \in \mathbb{Q} \,\} \\
&= \mathbb{Q}[\sqrt[3]{2}].
\end{aligned}$$

By Theorem 7.14 it follows that $\mathbb{Q}[\sqrt[3]{2}] \cong \mathbb{Q}[X]/\ker\phi$, and it remains for us
to prove that $\ker\phi = (X^3 - 2)\mathbb{Q}[X]$. Now certainly $X^3 - 2 \in \ker\phi$, since
$\phi(X^3 - 2) = (\sqrt[3]{2})^3 - 2 = 0$. But since $X^3 - 2$ is irreducible in $\mathbb{Q}[X]$ (by
Eisenstein's Criterion; see §6k#14), it follows from 8.2 that $\ker\phi = (X^3 - 2)\mathbb{Q}[X]$,
as required. ◁

#3 Let $Q = \mathbb{R}[X]/X^3\mathbb{R}[X]$. Then by 8.3 the set

$$\mathbb{R}' = \{\, X^3\mathbb{R}[X] + t \mid t \in \mathbb{R} \,\}$$

is a subring of $Q$ isomorphic to $\mathbb{R}$, and $\alpha = X^3\mathbb{R}[X] + X$ is an element of
$Q$ satisfying $\alpha^3 = 0$. Every element of $Q$ is uniquely expressible in the form
$t_0 + t_1\alpha + t_2\alpha^2$ with $t_0, t_1, t_2 \in \mathbb{R}'$. The rule for addition in $Q$ is obvious:

$$(s_0 + s_1\alpha + s_2\alpha^2) + (t_0 + t_1\alpha + t_2\alpha^2) = (s_0 + t_0) + (s_1 + t_1)\alpha + (s_2 + t_2)\alpha^2$$

for all $s_i, t_i \in \mathbb{R}'$ ($i = 0, 1, 2$). To multiply two elements of $Q$, simply expand
the product and use $\alpha^3 = 0$:

$$(s_0 + s_1\alpha + s_2\alpha^2)(t_0 + t_1\alpha + t_2\alpha^2) = s_0t_0 + (s_0t_1 + s_1t_0)\alpha + (s_0t_2 + s_1t_1 + s_2t_0)\alpha^2.$$
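
The multiplication rule amounts to multiplying polynomials and discarding every term of degree 3 or more. A minimal sketch, assuming elements of $Q$ are stored as coefficient triples `[t0, t1, t2]` (a representation and a function name of our own choosing):

```python
def mul_mod_x3(s, t):
    """Product of s0 + s1*a + s2*a^2 and t0 + t1*a + t2*a^2, where a^3 = 0."""
    out = [0, 0, 0]
    for i, si in enumerate(s):
        for j, tj in enumerate(t):
            if i + j < 3:          # terms involving a^3 or higher vanish
                out[i + j] += si * tj
    return out

alpha = [0, 1, 0]
alpha_squared = mul_mod_x3(alpha, alpha)          # [0, 0, 1]
alpha_cubed = mul_mod_x3(alpha_squared, alpha)    # [0, 0, 0]
sample = mul_mod_x3([1, 2, 3], [4, 5, 6])         # [4, 13, 28]
```

Since `alpha` and `alpha_squared` are nonzero while their product is zero, this small computation also exhibits zero divisors in $Q$.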


#4 Let $T = \mathbb{R}[X]/X\mathbb{R}[X]$. Using the bar notation again, we have

$$\overline{X} = \overline{0} = \text{zero element of } T,$$

since $X$ is in the ideal $X\mathbb{R}[X]$. (In factoring out $X\mathbb{R}[X]$ all elements of
$X\mathbb{R}[X]$ are regarded as being zero, and this includes the element $X$.) So
in accordance with 8.3 the ring $T$ can be thought of as obtained from $\mathbb{R}$ by
adjoining to $\mathbb{R}$ an element $\overline{X}$ satisfying $\overline{X} = 0$.

Adjoining 0 to $\mathbb{R}$ doesn't do a lot: 0 is already an element of $\mathbb{R}$. So we
should have that $T \cong \mathbb{R}$. Again this can be proved using 7.14. There is a
homomorphism $\phi\colon \mathbb{R}[X] \to \mathbb{R}$ given by $\phi(p(X)) = p(0)$ for all $p \in \mathbb{R}[X]$; that
is,

$$\phi(a_0 + a_1X + \cdots + a_nX^n) = a_0.$$

Clearly $\operatorname{im}\phi = \mathbb{R}$ and $\ker\phi = X\mathbb{R}[X]$, so 7.14 gives $\mathbb{R}[X]/X\mathbb{R}[X] \cong \mathbb{R}$. This
can also be seen directly by observing that

$$a_0 + a_1X + \cdots + a_nX^n \equiv a_0 \pmod{X\mathbb{R}[X]},$$

and hence that every element of $\mathbb{R}[X]/X\mathbb{R}[X]$ is equal to $\overline{a}$ for some $a \in \mathbb{R}$.


§8c Fields as quotient rings of polynomial rings


8.5 Definition If $F$ is a subfield of a field $E$ then $E$ is called an extension
of $F$. More generally, if $E$ has a subfield isomorphic to $F$ we say that $E$ is
an extension of $F$.

We have seen that $\mathbb{R}[X]/(X^2 + 1)\mathbb{R}[X]$ is a field (isomorphic to $\mathbb{C}$)
containing $\mathbb{R}$ as a subfield, and that $\mathbb{Q}[X]/(X^2 - 3)\mathbb{Q}[X]$ is a field (isomorphic
to $\mathbb{Q}[\sqrt{3}]$) containing $\mathbb{Q}$ as a subfield. However, $\mathbb{R}[X]/X^3\mathbb{R}[X]$ is not a field,
since it contains zero divisors. (The element $\alpha = X^3\mathbb{R}[X] + X$ is nonzero
but satisfies $\alpha^3 = 0$, as we saw in #3.) We are led to wonder under what
circumstances a quotient ring of $F[X]$ is a field.

Analogy with quotient rings of $\mathbb{Z}$ provides a clue to the answer. We
have seen (Theorems 4.10 and 4.11) that $\mathbb{Z}/n\mathbb{Z}$ is a field if and only if $n$ is
prime; that is, the quotient is a field if and only if the ideal is generated by
a prime. Exactly the same is true for quotients of $F[X]$.

8.6 Theorem Let $F$ be a field and $p(X) \in F[X]$. Then $F[X]/p(X)F[X]$
is a field if and only if $p$ is irreducible.

Proof. Let $I = p(X)F[X]$, and suppose first that $p(X)$ is not
irreducible. Then either $p(X)$ is a constant or else $\deg(p) > 1$ and $p(X)$ has a
factorization $p(X) = s(X)t(X)$ for some $s$ and $t$ of degree less than $\deg(p)$.

If $p(X)$ is a constant then by 8.4 $F[X]/I$ is isomorphic either to $F[X]$,
which is not a field since polynomials of degree greater than 1 do not have
inverses in $F[X]$, or to the trivial ring $\{0\}$, which is not a field since it does not
have a nonzero identity element. We are left with the case $p(X) = s(X)t(X)$
with $1 \leq \deg(s) < \deg(p)$ and $1 \leq \deg(t) < \deg(p)$. Now since $p(X) \in I$ we
find that

$$(I + s(X))(I + t(X)) = I + s(X)t(X) = I + p(X) = I = I + 0,$$

the zero element of $F[X]/I$. Moreover, $p(X)$ cannot be a factor of $s(X)$ or
$t(X)$ (since $1 \leq \deg(s) < \deg(p)$ and $1 \leq \deg(t) < \deg(p)$), and therefore
$I + s(X) \neq I$ and $I + t(X) \neq I$. So the ring $F[X]/I$ has zero divisors, hence
is not an integral domain, hence is not a field.

Suppose, on the other hand, that $p(X)$ is irreducible. We must prove
that $Q = F[X]/I$ is a field. Since it is certainly a ring, it suffices to prove that
it is commutative and has a nonzero identity, and that all nonzero elements
have inverses.

Let $\alpha, \beta \in Q$. Then $\alpha = I + f(X)$, $\beta = I + g(X)$ for some $f(X)$ and
$g(X)$ in $F[X]$, and

$$\begin{aligned}
\alpha\beta = (I + f(X))(I + g(X)) &= I + f(X)g(X) \\
&= I + g(X)f(X) = (I + g(X))(I + f(X)) = \beta\alpha
\end{aligned}$$

by commutativity of $F[X]$. Furthermore,

$$\begin{aligned}
\alpha(I + 1) = (I + f(X))(I + 1) &= I + f(X)1 \\
&= I + 1f(X) = (I + 1)(I + f(X)) = (I + 1)\alpha
\end{aligned}$$

where 1 is the identity of $F$, and both sides are equal to $I + f(X) = \alpha$. Thus
$Q$ is commutative and has an identity.
The identity is nonzero since $I + 1 = I$ would imply that $1 \in I$ and hence
that $p(X) \mid 1$, which is impossible since $\deg(p) \neq 0$.

Let $\alpha$ be a nonzero element of $Q$, and let $f(X)$ be an element of $F[X]$
such that $\alpha = I + f(X)$. Then $f(X) \not\equiv 0 \pmod{I}$, since

$$I + f(X) \neq I = \text{zero element of } Q,$$

and so $p(X) \nmid f(X)$. Thus the gcd of $p(X)$ and $f(X)$ cannot be an associate
of $p(X)$, and since $p(X)$ is irreducible the only other divisors it has are
polynomials of degree 0. So the gcd of $p(X)$ and $f(X)$ must be 1. Now by
6.14 there exist $m(X)$ and $n(X)$ with $m(X)p(X) + n(X)f(X) = 1$, and this
gives

$$\begin{aligned}
\alpha(I + n(X)) &= (I + f(X))(I + n(X)) \\
&= I + f(X)n(X) \\
&= I + (1 - m(X)p(X)) \\
&= I + 1
\end{aligned}$$

since $m(X)p(X) \in I$. Thus $\alpha$ has an inverse, as required. □
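
The last step of the proof is constructive: the polynomial $n(X)$ can be computed by the extended Euclidean algorithm. The following sketch is one possible implementation, given under assumptions of our own ($F = \mathbb{Q}$ via `fractions.Fraction`, coefficient lists with the constant term first, and hypothetical function names); it is not claimed to be the method behind 6.14.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients of highest degree (constant term comes first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1) if p and q else []
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_divmod(g, f):
    """Return (q, r) with g = q*f + r and deg r < deg f; f must be nonzero."""
    r = trim(g)
    q = [Fraction(0)] * max(len(r) - len(f) + 1, 0)
    while len(r) >= len(f):
        shift, coeff = len(r) - len(f), r[-1] / f[-1]
        q[shift] = coeff
        for i, c in enumerate(f):
            r[i + shift] -= coeff * c
        r = trim(r)
    return trim(q), r

def inverse_mod(f, p):
    """Return n(X) with n(X)f(X) congruent to 1 modulo p(X)F[X], if gcd(p, f) = 1."""
    p = trim([Fraction(c) for c in p])
    # Invariant: r0 = n0*f and r1 = n1*f modulo p.
    r0, n0 = p, []
    r1, n1 = trim([Fraction(c) for c in f]), [Fraction(1)]
    while r1:
        q, r = poly_divmod(r0, r1)
        r0, n0, r1, n1 = r1, n1, r, poly_add(n0, poly_mul([-c for c in q], n1))
    if len(r0) != 1:
        raise ValueError("gcd(p, f) is not a constant, so I + f has no inverse")
    return poly_divmod([c / r0[0] for c in n0], p)[1]

# Inverse of I + (1 + X) in Q[X]/(X^2 - 3)Q[X]
inv = inverse_mod([1, 1], [-3, 0, 1])
```

For this input `inv` has coefficients $-\tfrac12$ and $\tfrac12$, in agreement with $(1 + \sqrt{3})^{-1} = (\sqrt{3} - 1)/2$. If $p$ is reducible the function may raise `ValueError`, for instance for $f = X - 1$ and $p = X^2 - 3X + 2$.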
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- Examples - 

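#5 Let $F = \mathbb{R}$ and $p(X) = X^2 - 3X + 2 = (X - 1)(X - 2)$. Then $p(X)$
is not irreducible, and so $Q = \mathbb{R}[X]/p(X)\mathbb{R}[X]$ is not a field. Indeed,

$$\overline{X - 1} = p(X)\mathbb{R}[X] + (X - 1)$$

and

$$\overline{X - 2} = p(X)\mathbb{R}[X] + (X - 2)$$

are nonzero elements of $Q$ with product zero:

$$\overline{(X - 1)}\;\overline{(X - 2)} = \overline{X^2 - 3X + 2} = \overline{0},$$

since $X^2 - 3X + 2 \equiv 0 \pmod{p(X)\mathbb{R}[X]}$.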

#6 Let $F = \mathbb{Q}$ and $p(X) = X^2 - 3$. We have seen that $X^2 - 3$ is irreducible
in $\mathbb{Q}[X]$ (§6i#6), and so $\mathbb{Q}[X]/(X^2 - 3)\mathbb{Q}[X]$ is a field. We had noted this
already in #1 above.

#7 The polynomial $X^2 + 1$ is irreducible in $\mathbb{R}[X]$, hence the quotient ring
$\mathbb{R}[X]/(X^2 + 1)\mathbb{R}[X]$ is a field (as we had seen in §7d#7).

#8 We saw in §6j#12 that $p(X) = X^3 + X + 1$ is an irreducible polynomial in
$\mathbb{Z}_2[X]$. Hence if $I = p(X)\mathbb{Z}_2[X]$ we have that $K = \mathbb{Z}_2[X]/I$ is a field
containing $\mathbb{Z}_2$ as a subfield. By Theorem 8.3 each element of $K$ is uniquely
expressible in the form $a_0 + a_1\overline{X} + a_2\overline{X}^2$ with $a_0, a_1, a_2 \in \mathbb{Z}_2$ (where $\overline{X} = I + X$).
Since there are exactly two choices (0 or 1) for each of $a_0$, $a_1$ and $a_2$, there
are exactly eight possible expressions $a_0 + a_1\overline{X} + a_2\overline{X}^2$. So $K$ is a field with
eight elements.
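
The eight elements and their inverses can be listed by machine. In the sketch below, which is only an illustration, an element $a_0 + a_1\overline{X} + a_2\overline{X}^2$ is encoded as the 3-bit integer whose bit $i$ is $a_i$; this encoding and the name `gf8_mul` are our own assumptions.

```python
MODULUS = 0b1011   # X^3 + X + 1 over Z_2; bit i holds the coefficient of X^i

def gf8_mul(a, b):
    """Multiply two elements of Z_2[X]/(X^3 + X + 1), each given as a 3-bit integer."""
    prod = 0
    for i in range(3):
        if (b >> i) & 1:
            prod ^= a << i              # adding coefficients in Z_2 is XOR
    for shift in (1, 0):                # cancel the X^4 term, then the X^3 term
        if (prod >> (3 + shift)) & 1:
            prod ^= MODULUS << shift
    return prod

elements = range(8)
# For each nonzero element, search for an element whose product with it is 1.
inverses = {a: next(b for b in elements if gf8_mul(a, b) == 1)
            for a in elements if a != 0}
```

Every nonzero element receives an inverse; for instance the inverse of $\overline{X}$ (encoded as 2) is $\overline{X}^2 + 1$ (encoded as 5), because $X^3 + X = (X^3 + X + 1) + 1$.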


§8d Field extensions and vector spaces 

Let $F$ be a field and $p(X)$ a polynomial over $F$ of degree $n \geq 1$. Let
$Q = F[X]/I$, where $I = p(X)F[X]$. By Theorem 8.3 each element of $Q$
is uniquely expressible in the form $a_0 + a_1\overline{X} + \cdots + a_{n-1}\overline{X}^{n-1}$ where the
coefficients $a_0, a_1, \ldots, a_{n-1}$ are elements of $F$ and $\overline{X} = I + X$. We deduce
the following lemma:

8.7 Lemma The elements $1, \overline{X}, \ldots, \overline{X}^{n-1}$ form a basis for $Q$ considered as
a vector space over $F$.

Proof. The proof that $Q$ is a vector space over $F$ is a straightforward
checking of Property ($*$) and Axioms (i)-(viii) listed in §0c, and is
omitted. As observed above, every element of $Q$ can be expressed as a linear
combination of the given elements; that is, they span $Q$. Suppose that
$a_0 + a_1\overline{X} + \cdots + a_{n-1}\overline{X}^{n-1} = 0$ with the $a_i \in F$. Then

$$a_0 + a_1\overline{X} + \cdots + a_{n-1}\overline{X}^{n-1} = 0 + 0\overline{X} + \cdots + 0\overline{X}^{n-1}$$

and by the uniqueness part of 8.3 (i) it follows that all the $a_i$ are equal to 0.
This proves linear independence. □

As a consequence of the lemma we have the following: 

8.8 Theorem Let $Q = F[X]/p(X)F[X]$ where $F$ is a field and $p$ a
polynomial over $F$ of degree $n \geq 1$. Then $Q$ is a vector space over $F$, and the
dimension of this vector space is $n$.

In the situation described above, if $p$ is irreducible then $Q$ is an
extension field of $F$. In general, if $E$ is any extension of a field $F$ then ($*$) and
(i)-(viii) of §0c are satisfied, and so $E$ may be regarded as a vector space over
$F$. The dimension of this vector space is called the degree of the extension.

8.9 Definition If $E$ is an extension field of $F$ the degree of $E$ over $F$,
denoted by '$[E : F]$', is the dimension of $E$ as a vector space over $F$.

Comments ▷▷▷

8.9.1 There is no guarantee that $[E : F]$ is finite.

8.9.2 If $F[X]/p(X)F[X]$ is a field then its degree over $F$ equals $\deg(p)$.

▷▷▷


§8e Extensions of extensions


8.10 Theorem Suppose that $F$, $E$, $K$ are fields with $F \subseteq E \subseteq K$, and
suppose that $[K : E] = m$ and $[E : F] = n$. Then $[K : F] = mn$.

Proof. Let $x_1, x_2, \ldots, x_m$ be a basis for $K$ over $E$ and let $y_1, y_2, \ldots, y_n$ be
a basis for $E$ over $F$. We show that

8.10.1 $\quad x_1y_1,\ x_1y_2,\ \ldots,\ x_1y_n,\ x_2y_1,\ \ldots,\ x_2y_n,\ \ldots,\ x_my_1,\ \ldots,\ x_my_n$

is a basis for $K$ over $F$.

We prove first that the elements 8.10.1 span $K$ over $F$. Let $t \in K$.
Then since $\{x_1, \ldots, x_m\}$ spans $K$ over $E$ there exist $s_1, s_2, \ldots, s_m \in E$ with

$$t = s_1x_1 + s_2x_2 + \cdots + s_mx_m.$$

Now each $s_i$ is an $F$-linear combination of $y_1, y_2, \ldots, y_n$, since these elements
span $E$ over $F$. So we have

$$\begin{aligned}
s_1 &= u_{11}y_1 + u_{12}y_2 + \cdots + u_{1n}y_n \\
s_2 &= u_{21}y_1 + u_{22}y_2 + \cdots + u_{2n}y_n \\
&\;\;\vdots \\
s_m &= u_{m1}y_1 + u_{m2}y_2 + \cdots + u_{mn}y_n
\end{aligned}$$

with the coefficients $u_{ij}$ in $F$ for all $i$ and $j$. Now substituting gives

$$t = u_{11}y_1x_1 + u_{12}y_2x_1 + \cdots + u_{1n}y_nx_1 + \cdots + u_{mn}y_nx_m,$$

an $F$-linear combination of the elements 8.10.1.

Now we must show that the elements 8.10.1 are linearly independent
over $F$. Suppose that $u_{ij}$ ($1 \leq i \leq m$, $1 \leq j \leq n$) are elements of $F$ such
that $\sum_{i=1}^{m} \sum_{j=1}^{n} u_{ij}y_jx_i = 0$. Then

$$\Bigl(\sum_{j=1}^{n} u_{1j}y_j\Bigr)x_1 + \Bigl(\sum_{j=1}^{n} u_{2j}y_j\Bigr)x_2 + \cdots + \Bigl(\sum_{j=1}^{n} u_{mj}y_j\Bigr)x_m = 0,$$

and since each coefficient $\sum_{j=1}^{n} u_{ij}y_j$ is an element of $E$ and $x_1, x_2, \ldots, x_m$
are linearly independent over $E$ we deduce that $\sum_{j=1}^{n} u_{ij}y_j = 0$ for each $i$.
Now the linear independence over $F$ of $y_1, y_2, \ldots, y_n$ gives $u_{ij} = 0$ for all $i$
and $j$, as required. □

§8f Algebraic and transcendental elements

Let $E$ be an extension field of $F$ and let $a \in E$. Any subring of $E$ containing
$F$ and containing $a$ clearly must contain $a^2$, $a^3$, $a^4$, ..., and hence must
contain everything of the form $b_0 + b_1a + \cdots + b_na^n$ with $0 \leq n \in \mathbb{Z}$ and
$b_0, b_1, \ldots, b_n \in F$. That is, it must contain everything of the form $f(a)$ for
$f(X) \in F[X]$.

8.11 Definition Let $F[a]$ be the subset of $E$ defined by

$$F[a] = \{\, f(a) \mid f(X) \in F[X] \,\}.$$

Any subfield $S$ of $E$ which contains $F$ and also $a$ certainly contains $F[a]$
(since $S$ is also a subring). So if $u, v \in F[a]$ and $v \neq 0$ then it follows that
$uv^{-1} \in S$.

8.12 Definition Let $F(a) = \{\, uv^{-1} \mid u, v \in F[a] \text{ and } v \neq 0 \,\}$.

8.13 Theorem Let $E$ be an extension field of $F$ and let $a \in E$.

(i) $F[a]$ is a subring of $E$ containing $F$ and $a$, and any subring of $E$
containing $F$ and $a$ contains $F[a]$.

(ii) $F(a)$ is a subfield of $E$ containing $F[a]$. Any subfield of $E$ containing
$F$ and $a$ contains $F(a)$.

Proof. (i) We proved in the discussion above that every subring
containing $F$ and $a$ contains $F[a]$. That $F[a]$ is a subring follows from 7.3, since
$F[a]$ is the image of the evaluation homomorphism $f(X) \mapsto f(a)$ from $F[X]$
to $E$.

(ii) It suffices to prove that $F(a)$ is a subfield, since the other assertion was
proved above. We use Theorem 5.3.

Since $F$ contains the zero and identity of $E$, so too does $F(a)$. Now if
$x, y \in F(a)$ then $x = uv^{-1}$ and $y = st^{-1}$ for some $u, v, s, t \in F[a]$ with $v$ and
$t$ nonzero, and by the closure properties of the subring $F[a]$ we have that
$ut + vs$, $us$, $vt$ and $-u$ are all in $F[a]$, with $vt \neq 0$ since $E$ is a field. Hence

$$\begin{aligned}
x + y &= uv^{-1} + st^{-1} = (ut + vs)(vt)^{-1} \in F(a) \\
xy &= (uv^{-1})(st^{-1}) = (us)(vt)^{-1} \in F(a) \\
-x &= -(uv^{-1}) = (-u)v^{-1} \in F(a)
\end{aligned}$$

while if $u \neq 0$ then $x^{-1} = vu^{-1} \in F(a)$. So all the requirements of Theorem 5.3 are
satisfied, and $F(a)$ is a subfield. □

Theorem 8.13 justifies the following terminology:

F[a\ is the subring of E generated by F and a, 

F(a ) is the subfield of E generated by F and a. 

8.14 Definition Let $E$ be an extension field of $F$, and let $a \in E$. If there
exists a nonzero polynomial $f(X) \in F[X]$ with $f(a) = 0$ then $a$ is said to be
algebraic over $F$. Otherwise $a$ is said to be transcendental over $F$.

8.15 Theorem Let $E$ be an extension field of $F$ and let $a \in E$.

(i) If $a$ is transcendental over $F$ then $F[a] \cong F[X]$ (where $X$ is an
indeterminate), and $F(a) \ne F[a]$.

(ii) If $a$ is algebraic over $F$ then

(a) there exists a unique monic irreducible polynomial $p(X) \in F[X]$
for which $p(a) = 0$,

(b) $F[X]/I \cong F[a]$, where $I = p(X)F[X]$ is the principal ideal of
$F[X]$ generated by $p$,

(c) $F(a) = F[a]$.

Proof. Let $\phi$ be the evaluation homomorphism $F[X] \to E$ given by the
rule $\phi(f(X)) = f(a)$ for all $f(X) \in F[X]$. Then by 7.14,

\[
\begin{aligned}
F[X]/\ker\phi &\cong \operatorname{im}\phi \\
  &= \{\, \phi(f(X)) \mid f(X) \in F[X] \,\} \\
  &= \{\, f(a) \mid f(X) \in F[X] \,\} \\
  &= F[a].
\end{aligned}
\]

(i) Suppose first that $a$ is transcendental over $F$. Then there are no nonzero
polynomials $f(X) \in F[X]$ with $f(a) = 0$, and so

\[ \ker\phi = \{\, f(X) \mid \phi(f(X)) = 0 \,\} = \{0\}. \]

Thus $F[a] \cong F[X]/\{0\} \cong F[X]$, and we have proved the first assertion in (i).
Since $F[X]$ is not a field we deduce (by §5b#11 in Chapter 5) that $F[a]$ is
not a field, and therefore $F(a) \ne F[a]$.

(ii) Suppose now that $a$ is algebraic over $F$. By Definition 8.14 there exist
nonzero elements of $F[X]$ of which $a$ is a zero, and therefore $\ker\phi \ne \{0\}$. Let
$p(X)$ be a nonzero polynomial of minimal degree in $\ker\phi$. Since associates of
elements of $\ker\phi$ will also be in $\ker\phi$, we may choose $p(X)$ to be monic (by
6.12.1). We have $\ker\phi = p(X)F[X]$ (by 8.1.1), and also $p(a) = \phi(p(X)) = 0$
(since $p(X) \in \ker\phi$). Clearly $\deg(p) \ge 1$, since $p(X)$ is nonzero and $p(a) = 0$.
If $s(X)$ and $t(X)$ are polynomials of smaller degree than $p(X)$ such that
$p(X) = s(X)t(X)$ then since $s(a)t(a) = p(a) = 0$ and the field $E$ can have no
zero divisors it follows that either $s(a) = 0$ or $t(a) = 0$. But this contradicts
the choice of $p(X)$ as a polynomial of minimal degree of which $a$ is a zero.
So $p(X)$ has no such factorization, and is therefore irreducible.

To complete the proof of (a) it remains to show that $p(X)$ is the unique
monic irreducible element of $F[X]$ of which $a$ is a zero. So, assume that
$q(X) \in F[X]$ is irreducible, monic and satisfies $q(a) = 0$. Then $\phi(q(X)) = 0$,
and consequently

\[ q(X) \in \ker\phi = p(X)F[X]. \]

Thus $p(X) \mid q(X)$, and since $q(X)$ is irreducible and $\deg(p) \ge 1$ it follows that
$p(X)$ and $q(X)$ are associates. Because they are both monic this implies that
$q(X) = p(X)$.

Since $F[X]/\ker\phi \cong F[a]$ and $\ker\phi = p(X)F[X]$, part (b) has been
proved. We know by Theorem 8.6 and §5b#11 that $F[a]$ is a field; so the
second assertion in 8.13 (ii) yields that $F[a]$ contains $F(a)$. But the first
assertion in 8.13 (ii) gives the reverse inclusion, and therefore $F(a) = F[a]$,
proving (c). □

Comment ▷▷▷

8.15.1 The polynomial $p(X)$ in 8.15 (ii) is called the minimal polynomial
of the algebraic element $a$. Note that the minimal polynomial is always
irreducible. Note also that if $p$ is the minimal polynomial of $a$ then $F[a]$ is
an extension of $F$ of degree equal to $\deg(p)$. ◁◁◁
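Part (c) of Theorem 8.15 says that every nonzero element of $F[a]$ already has its inverse inside $F[a]$. As a minimal sketch (not part of the original notes), the Python code below finds such an inverse for $F = \mathbb{Q}$ by running the Euclidean algorithm in $\mathbb{Q}[X]$ modulo the minimal polynomial. The helper names (`trim`, `poly_divmod`, `inverse_mod`, and so on) are illustrative choices, and polynomials are assumed to be stored as coefficient lists with the constant term first.

```python
from fractions import Fraction

def trim(f):
    """Convert to Fractions and drop trailing zero coefficients."""
    f = [Fraction(c) for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def poly_divmod(f, g):
    """Quotient and remainder of f on division by the nonzero polynomial g."""
    f, g = trim(f), trim(g)
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    while len(f) >= len(g):
        shift = len(f) - len(g)
        c = f[-1] / g[-1]
        q[shift] = c
        for i, gc in enumerate(g):
            f[i + shift] -= c * gc
        f = trim(f)
    return trim(q), f

def poly_mul(f, g):
    out = [Fraction(0)] * (len(f) + len(g) - 1) if f and g else []
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def poly_sub(f, g):
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim([a - b for a, b in zip(f, g)])

def inverse_mod(u, p):
    """Inverse of the coset of u in Q[X]/(p); p is assumed irreducible."""
    # Invariant of the extended Euclidean algorithm:
    # r0 = s0*u (mod p) and r1 = s1*u (mod p).
    r0, s0 = trim(p), []
    r1, s1 = poly_divmod(u, p)[1], [Fraction(1)]
    while len(r1) > 1:                 # stop once r1 is a constant
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, poly_sub(s0, poly_mul(q, s1))
    if not r1:
        raise ZeroDivisionError("u is zero modulo p")
    return poly_divmod([c / r1[0] for c in s1], p)[1]

# Inverse of 1 + a, where a is the real cube root of 2 (minimal polynomial X^3 - 2).
p = [-2, 0, 0, 1]
print(inverse_mod([1, 1], p))
```

For $a = \sqrt[3]{2}$ the result corresponds to $(1 + a)^{-1} = \tfrac{1}{3}(1 - a + a^2)$, which can be checked by hand using $a^3 = 2$.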

- Example -

#9 If $F$ is a subfield of $\mathbb{R}$ and $0 < a \in F$ with $\sqrt{a} \notin F$ then

\[ F(\sqrt{a}) = F[\sqrt{a}] = \{\, x + y\sqrt{a} \mid x, y \in F \,\} \]

is a subfield of $\mathbb{R}$, and is an extension of $F$ of degree 2.
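The closure of $\{x + y\sqrt{a}\}$ under multiplication and inversion can be made concrete. The following is a small sketch, assuming $F = \mathbb{Q}$ and $a = 2$; the pair `(x, y)` stands for $x + y\sqrt{2}$, and the names `A`, `mul` and `inv` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

A = Fraction(2)   # the element a of F = Q; its square root is not rational

def mul(u, v):
    """Product of x1 + y1*sqrt(A) and x2 + y2*sqrt(A), returned as a pair."""
    (x1, y1), (x2, y2) = u, v
    return (x1 * x2 + A * y1 * y2, x1 * y2 + y1 * x2)

def inv(u):
    """Inverse of a nonzero x + y*sqrt(A), using the conjugate x - y*sqrt(A)."""
    x, y = u
    norm = x * x - A * y * y      # nonzero because sqrt(A) is not in Q
    return (x / norm, -y / norm)

u = (Fraction(1), Fraction(1))    # 1 + sqrt(2)
print(inv(u))                     # the pair representing -1 + sqrt(2)
print(mul(u, inv(u)))             # the pair representing 1
```

The inverse again has the form $x + y\sqrt{a}$ with $x, y \in F$, which is the content of $F(\sqrt{a}) = F[\sqrt{a}]$ in this case.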
§8g Ruler and compass constructions revisited

We finally have the machinery at hand to deal with the classical geometrical 
problems described in Chapter 1. Reformulating Theorems 1.1 and 1.2 gives 
the following characterization of constructible numbers: 

8.16 Theorem A real number $t$ is constructible if and only if there is a
finite sequence of subfields of $\mathbb{R}$

\[ \mathbb{Q} = F_1 \subseteq F_2 \subseteq \cdots \subseteq F_n \]

such that $t \in F_n$ and for each $i = 1, 2, \dots, n - 1$ there is an $a_i \in F_i$ such
that $F_{i+1} = F_i(\sqrt{a_i})$.

Proof. Suppose firstly that we are given such a chain of subfields $F_i$ of
$\mathbb{R}$: we will prove that all the elements of the subfields are constructible
numbers. Since 1 is constructible it follows from 1.1 that all elements of $\mathbb{Q}$
are constructible; that is, $F_1 \subseteq \mathrm{Con}$. But if $F_i \subseteq \mathrm{Con}$ and $a, b \in F_i$ then
$a, b, a_i \in \mathrm{Con}$, and by 1.1 it follows that $a + b\sqrt{a_i} \in \mathrm{Con}$. So $F_{i+1} \subseteq \mathrm{Con}$,
and by induction all fields in the chain are contained in $\mathrm{Con}$.

Conversely, let $t$ be a constructible number, and let $\alpha$ be a constructible
point one of whose coordinates is $t$. Let $\alpha_0, \alpha_1, \dots, \alpha_n = \alpha$ be the points
obtained in a ruler and compass construction, listed in the order obtained.
(Thus $\alpha_0 = (0, 0)$ and $\alpha_1 = (1, 0)$.) For each $i$ let $F_i$ be the set of all real
numbers obtainable from the coordinates of $\alpha_0, \alpha_1, \dots, \alpha_i$ by finite sequences of
operations of addition, subtraction, multiplication and division. Clearly each
$F_i$ is closed under these operations, and we see from 5.3 that $F_i$ is a subfield of
$\mathbb{R}$. Moreover, by 1.2 there exists $a_i \in F_i$ such that $\alpha_{i+1} = (s + t\sqrt{a_i},\, u + v\sqrt{a_i})$
for some $s, t, u, v \in F_i$. If $t = v = 0$ we can replace $a_i$ by 0, and in this
case we have $F_{i+1} = F_i = F_i(\sqrt{a_i})$. If $t$ or $v$ is nonzero then all elements
of $F_i(\sqrt{a_i})$ can be obtained from the coordinates of $\alpha_{i+1}$ and elements of $F_i$ by
finite sequences of field operations, and we see that the field $F_{i+1}$ is equal to
$F_i(\sqrt{a_i})$ in this case too. Hence we have a sequence of fields of the required
kind with $t \in F_n$. □

Now we can dispose of the Delian Problem: 

8.17 Theorem The number $\sqrt[3]{2}$ is not constructible: a cube cannot be
duplicated by ruler and compass.

Proof. Suppose that $\sqrt[3]{2}$ is a constructible number. By 8.16 there is a
sequence of fields $\mathbb{Q} = F_1 \subseteq F_2 \subseteq \cdots \subseteq F_n$, each a quadratic extension
of the preceding, with $\sqrt[3]{2} \in F_n$. Let $F_i = F_{i-1}(\sqrt{a_{i-1}})$, and assume that
$F_i \ne F_{i-1}$. (If $F_i = F_{i-1}$ simply delete $F_i$ from the sequence.) Then $\sqrt{a_{i-1}}$ is
not in $F_{i-1}$, and so $X^2 - a_{i-1}$ is an irreducible polynomial in $F_{i-1}[X]$. Hence

\[ [F_i : F_{i-1}] = \deg(X^2 - a_{i-1}) = 2. \]

Since there are $n - 1$ such steps, by Theorem 8.10 it follows that $[F_n : \mathbb{Q}] = 2^{n-1}$.
But since $\sqrt[3]{2} \in F_n$ it follows that $\mathbb{Q}(\sqrt[3]{2}) \subseteq F_n$. Furthermore, as we have
seen, $X^3 - 2$ is an irreducible element of $\mathbb{Q}[X]$, and therefore

\[ [\mathbb{Q}(\sqrt[3]{2}) : \mathbb{Q}] = \deg(X^3 - 2) = 3. \]

By Theorem 8.10 we have

\[
\begin{aligned}
2^{n-1} &= [F_n : \mathbb{Q}] \\
  &= [F_n : \mathbb{Q}(\sqrt[3]{2})]\,[\mathbb{Q}(\sqrt[3]{2}) : \mathbb{Q}] \\
  &= 3\,[F_n : \mathbb{Q}(\sqrt[3]{2})],
\end{aligned}
\]

which is a contradiction, since the degree $[F_n : \mathbb{Q}(\sqrt[3]{2})]$ must be an integer, but
a power of two is not divisible by three. □

A similar proof applies for $k = \cos(\frac{\pi}{9})$. Since $\cos 3\theta = 4\cos^3\theta - 3\cos\theta$
we have that $4k^3 - 3k = \frac{1}{2}$; that is, $k$ is a zero of $8X^3 - 6X - 1$. This
polynomial is irreducible in $\mathbb{Q}[X]$, and so $[\mathbb{Q}(k) : \mathbb{Q}] = 3$. Thus, by the
same argument as above, $k$ cannot lie in an extension field of $\mathbb{Q}$ of degree a
power of two. Thus we have proved

8.18 Theorem The number $\cos(\frac{\pi}{9})$ is not constructible; an angle of sixty
degrees cannot be trisected.
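Both impossibility proofs rest on a cubic being irreducible over $\mathbb{Q}$. A cubic with no rational zero has no linear factor and is therefore irreducible, and the rational root test limits the candidates to finitely many fractions. The sketch below, which is an illustration rather than the argument used in the text, checks this for $X^3 - 2$ and $8X^3 - 6X - 1$; the function names are hypothetical, and coefficients are listed from the constant term upward.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of the nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational zeros of an integer polynomial with nonzero constant and
    leading coefficients, using the rational root test."""
    candidates = {Fraction(sign * num, den)
                  for num in divisors(coeffs[0])
                  for den in divisors(coeffs[-1])
                  for sign in (1, -1)}
    return sorted(r for r in candidates
                  if sum(c * r**i for i, c in enumerate(coeffs)) == 0)

print(rational_roots([-2, 0, 0, 1]))     # X^3 - 2: no rational zeros
print(rational_roots([-1, -6, 0, 8]))    # 8X^3 - 6X - 1: no rational zeros
```

An empty list for a cubic means it is irreducible over $\mathbb{Q}$; this shortcut is only valid in degree 2 or 3, since a polynomial of higher degree can factor without having a zero.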

I am forced now to confess that the third classical problem is beyond
the scope of this course. The proof that $\sqrt{\pi}$ is not constructible depends on
showing that $\pi$ is transcendental over $\mathbb{Q}$; that is, $\pi$ is not a zero of any
polynomial equation over $\mathbb{Q}$. From this it follows that $\sqrt{\pi}$ is also transcendental,
and hence not constructible (since it follows readily from 8.16 that every
constructible number is algebraic over $\mathbb{Q}$). To prove that $\pi$ is transcendental
would require a lengthy digression into Number Theory. The interested
reader is referred to Hardy and Wright “An Introduction to the Theory of
Numbers” (4th ed.) §11.14, p. 173.

§8h Finite fields

Although we have answered the questions we set out to answer, it would be 
a shame to leave the subject without a few words on fields which have only 
finitely many elements. If $p \in \mathbb{Z}^+$ is prime then $\mathbb{Z}_p$ is a field with exactly $p$
elements (by Theorems 4.10 and 4.11). We have also seen (#8) how a field 
can be constructed which has exactly eight elements. It is natural to wonder 
for which positive integers n a field can be found with exactly n elements. 

8.19 Theorem Let F be any field. Then the characteristic of F is either 
zero or a prime number. 

Proof. Suppose to the contrary that $F$ has characteristic $m$ and that $m$ is
composite (that is, not prime). Then $m = rs$ for some $r, s \in \{2, 3, \dots, m - 1\}$.
By 5.11 the elements $r1$ and $s1$ of $F$ are nonzero (since $r$ and $s$ are less than
the characteristic of $F$), but $(r1)(s1) = m1 = 0$. This contradicts the fact
that there are no zero divisors in a field. □

Now suppose that $F$ is a field with exactly $n$ elements. Then the subset
$S$ of $F$ defined by

\[ S = \{\, n1 \mid n \in \mathbb{Z} \,\} \]

has only a finite number of elements, and so it cannot be isomorphic to $\mathbb{Z}$.
So by Theorem 5.12 it follows that $S \cong \mathbb{Z}_p$, where $p$ is the characteristic of
$F$, and by 8.19 we know that $p$ must be prime. Therefore we have proved
the following:

8.20 Proposition A finite field $F$ must be an extension of $\mathbb{Z}_p$ for some
prime $p$.

The field $F$ can be regarded as a vector space over the subfield $S \cong \mathbb{Z}_p$,
and since $F$ has only finitely many elements this must certainly be a finite
dimensional vector space. So the degree $[F : \mathbb{Z}_p]$ of the extension is finite.
From this we deduce the next theorem.

8.21 Theorem Suppose that $F$ is an extension of $\mathbb{Z}_p$ (where $p$ is prime),
and suppose that $[F : \mathbb{Z}_p] = k$. Then $F$ has exactly $p^k$ elements.

Proof. Let $a_1, a_2, \dots, a_k$ be a basis for $F$ over $\mathbb{Z}_p$. Then each element of
$F$ is uniquely expressible in the form $\lambda_1a_1 + \lambda_2a_2 + \cdots + \lambda_ka_k$ with the $\lambda_i$
in $\mathbb{Z}_p$, and since there are $p$ choices for each $\lambda_i$ there are $p^k$ such expressions
altogether. □
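The counting step in this proof amounts to listing coordinate vectors. As a tiny illustration (with a hypothetical function name, and with $\mathbb{Z}_p$ represented by the integers $0, 1, \dots, p - 1$), the following Python sketch enumerates all choices of $(\lambda_1, \dots, \lambda_k)$ and confirms that there are $p^k$ of them.

```python
from itertools import product

def coordinate_vectors(p, k):
    """All tuples (l_1, ..., l_k) with each l_i in Z_p = {0, ..., p-1}."""
    return list(product(range(p), repeat=k))

vectors = coordinate_vectors(3, 2)
print(vectors)        # one tuple for each element of a field with [F : Z_3] = 2
print(len(vectors))   # 3**2 tuples
```

Uniqueness of the expression in terms of the basis is what guarantees that distinct tuples give distinct elements of $F$.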
It is also possible to show that for each prime power $p^k$ there is a field
with $p^k$ elements, and that any two fields with $p^k$ elements are isomorphic.
To construct such a field one simply has to find an irreducible polynomial
$f \in \mathbb{Z}_p[X]$ with $\deg(f) = k$, for then 8.6 and 8.9.2 show that $F = \mathbb{Z}_p[X]/I$
(where $I = f(X)\mathbb{Z}_p[X]$) is a field satisfying $[F : \mathbb{Z}_p] = k$.

8.22 Theorem For each prime power $q$ there is (up to isomorphism) a
unique field with $q$ elements.

8.23 Definition The field referred to in 8.22 is called the Galois field with
$q$ elements. It is commonly denoted by ‘$GF(q)$’.

We omit the proof of Theorem 8.22, but give several examples to illustrate
Galois fields.

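- Examples -

#10 $GF(4)$

The only polynomials of degree 1 in $\mathbb{Z}_2[X]$ are $X$ and $X + 1$. Therefore the
only reducible polynomials of degree 2 are $X^2$, $X(X + 1) = X^2 + X$ and
$(X + 1)^2 = X^2 + 1$. (Of course $(X + 1)^2$ is equal to $X^2 + 2X + 1$, but $2 = 0$
in $\mathbb{Z}_2$.) Hence $p(X) = X^2 + X + 1$, the remaining polynomial of degree 2,
must be irreducible. Let $I = p(X)\mathbb{Z}_2[X]$ and $Q = \mathbb{Z}_2[X]/I$. Then $Q$ is an
extension field of $\mathbb{Z}_2$ of degree equal to $\deg(p) = 2$, and if $\alpha = I + X$ then
$1, \alpha$ is a basis for $Q$ over $\mathbb{Z}_2$ (by 8.7). Thus $Q$ has exactly 4 elements:

\[
\begin{aligned}
0 &= 0 \cdot 1 + 0\alpha & \alpha &= 0 \cdot 1 + 1\alpha \\
1 &= 1 \cdot 1 + 0\alpha & \alpha + 1 &= 1 \cdot 1 + 1\alpha.
\end{aligned}
\]

The rules for addition and multiplication of these elements are completely
determined by the equation $\alpha^2 + \alpha + 1 = 0$ together with the fact that the
characteristic of $Q$ is 2. Thus, for example,

\[ \alpha(\alpha + 1) = \alpha^2 + \alpha = (\alpha^2 + \alpha + 1) + 1 = 1. \]

Similar calculations yield the complete addition and multiplication tables for
$GF(4)$. (We have omitted 0 from the multiplication table since the rule for
multiplying by 0 is trivial: $0x = 0$ for all $x$.)

\[
\begin{array}{c|cccc}
+ & 0 & 1 & \alpha & \alpha+1 \\ \hline
0 & 0 & 1 & \alpha & \alpha+1 \\
1 & 1 & 0 & \alpha+1 & \alpha \\
\alpha & \alpha & \alpha+1 & 0 & 1 \\
\alpha+1 & \alpha+1 & \alpha & 1 & 0
\end{array}
\qquad
\begin{array}{c|ccc}
\times & 1 & \alpha & \alpha+1 \\ \hline
1 & 1 & \alpha & \alpha+1 \\
\alpha & \alpha & \alpha+1 & 1 \\
\alpha+1 & \alpha+1 & 1 & \alpha
\end{array}
\]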
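The arithmetic of $GF(4)$ is small enough to compute mechanically. In the sketch below (an illustration only), the pair `(c0, c1)` is assumed to represent $c_0 + c_1\alpha$ with $c_0, c_1 \in \mathbb{Z}_2$, and products are reduced using $\alpha^2 = \alpha + 1$; the names `add`, `mul` and `NAMES` are hypothetical, and `a` is printed in place of $\alpha$.

```python
def add(u, v):
    """Sum of two elements of GF(4), coefficientwise modulo 2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    """Product in GF(4).

    (a + b*alpha)(c + d*alpha) = ac + (ad + bc)*alpha + bd*alpha^2,
    and alpha^2 = alpha + 1 in characteristic 2.
    """
    a, b = u
    c, d = v
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

NAMES = {(0, 0): "0", (1, 0): "1", (0, 1): "a", (1, 1): "a+1"}
elements = list(NAMES)

print("addition table")
for u in elements:
    print([NAMES[add(u, v)] for v in elements])

print("multiplication table (nonzero elements)")
for u in elements[1:]:
    print([NAMES[mul(u, v)] for v in elements[1:]])
```

Each row of the multiplication table is a rearrangement of the three nonzero elements, as expected in a field.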
#11 $GF(16)$

Suppose that $f(X) \in \mathbb{Z}_2[X]$ is reducible and has degree 4. Then one of the
following cases must occur:

(a) $f(X)$ has four irreducible factors of degree 1. There are five possibilities
for $f(X)$:

\[ X^4, \quad X^3(X + 1), \quad X^2(X + 1)^2, \quad X(X + 1)^3, \quad (X + 1)^4. \]

(b) $f(X)$ has two irreducible factors of degree 1 and one of degree 2. There
are three possibilities:

\[ X^2(X^2 + X + 1), \quad X(X + 1)(X^2 + X + 1), \quad (X + 1)^2(X^2 + X + 1). \]

(c) $f(X)$ has one irreducible factor of degree 1 and one of degree 3. There
are four possibilities:

\[
\begin{array}{ll}
X(X^3 + X + 1) & X(X^3 + X^2 + 1) \\
(X + 1)(X^3 + X + 1) & (X + 1)(X^3 + X^2 + 1)
\end{array}
\]

(d) $f(X)$ has two irreducible factors of degree 2. The only possibility is

\[ f(X) = (X^2 + X + 1)^2. \]

So there are thirteen reducible polynomials of degree 4. Since there are
sixteen polynomials of degree 4 altogether it follows that there are three
irreducible ones, and they can be found by writing down all sixteen elements
of $\mathbb{Z}_2[X]$ of degree 4 and crossing out the thirteen reducible ones above. The
irreducibles are

\[ X^4 + X + 1, \quad X^4 + X^3 + 1, \quad X^4 + X^3 + X^2 + X + 1, \]

and we can use any of these in the construction of $GF(16)$. For instance, let
$I = (X^4 + X + 1)\mathbb{Z}_2[X]$ and $t = I + X \in \mathbb{Z}_2[X]/I$. Then the sixteen elements
of $\mathbb{Z}_2[X]/I$ are:

\[
\begin{array}{ll}
0,\ 1,\ t^2 + t,\ t^2 + t + 1 & \text{(forming a subfield with 4 elements)} \\
t,\ t^2,\ t + 1,\ t^2 + 1 & \text{(which are zeros of } X^4 + X + 1\text{)} \\
t^3,\ t^3 + t^2,\ t^3 + t^2 + t + 1,\ t^3 + t & \text{(zeros of } X^4 + X^3 + X^2 + X + 1\text{)} \\
t^3 + 1,\ t^3 + t^2 + 1,\ t^3 + t^2 + t,\ t^3 + t + 1 & \text{(zeros of } X^4 + X^3 + 1\text{)}.
\end{array}
\]
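The crossing-out procedure and the classification of the sixteen elements can both be checked by brute force. The following sketch is one possible way to do it, not the method of the text: a polynomial over $\mathbb{Z}_2$ is assumed to be encoded as an integer bit mask (bit $i$ is the coefficient of $X^i$, so `0b10011` is $X^4 + X + 1$), and all function names are illustrative.

```python
def degree(f):
    return f.bit_length() - 1

def poly_mod(f, g):
    """Remainder of f on division by g in Z_2[X] (polynomials as bit masks)."""
    while f and degree(f) >= degree(g):
        f ^= g << (degree(f) - degree(g))
    return f

def is_irreducible(f):
    # f of degree n >= 2 is reducible exactly when some polynomial of
    # degree between 1 and n // 2 divides it.
    n = degree(f)
    return all(poly_mod(f, g) != 0 for g in range(2, 1 << (n // 2 + 1)))

def mul_mod(u, v, m):
    """Product of u and v in Z_2[X], reduced modulo m."""
    prod = 0
    while v:
        if v & 1:
            prod ^= u
        u <<= 1
        v >>= 1
    return poly_mod(prod, m)

def evaluate(f, x, m):
    """Value of the polynomial f at the element x of Z_2[X]/(m), by Horner's rule."""
    result = 0
    for bit in bin(f)[2:]:                  # coefficients, leading one first
        result = mul_mod(result, x, m) ^ int(bit)
    return result

quartics = range(0b10000, 0b100000)         # the sixteen polynomials of degree 4
irreducibles = [f for f in quartics if is_irreducible(f)]
print([bin(f) for f in irreducibles])

M = 0b10011                                 # X^4 + X + 1, so t is the mask 0b0010
zeros = {f: [x for x in range(16) if evaluate(f, x, M) == 0] for f in irreducibles}
for f, xs in zeros.items():
    print(bin(f), [bin(x) for x in xs])
```

In this encoding the element $t^3 + t$ is `0b1010`, so each printed list of zeros can be compared directly with the corresponding row of the list of sixteen elements.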
#12 $GF(9)$

As in our previous examples we can explicitly determine the reducible
polynomials of degree 2 in $\mathbb{Z}_3[X]$, and deduce that the remaining ones are
irreducible. We find that there are three monic irreducibles, namely $X^2 - X - 1$,
$X^2 + X - 1$ and $X^2 + 1$. Now $GF(9)$ can be constructed by adjoining to
$GF(3) = \mathbb{Z}_3$ a zero of one of these polynomials (we can choose whichever we
like). Thus, for instance, if $s$ is a zero of $X^2 + 1$ then $GF(9)$ consists of

\[
\begin{array}{ll}
0,\ 1,\ -1 & \text{(lying in the subfield } \mathbb{Z}_3\text{)}, \\
s,\ -s & \text{(zeros of } X^2 + 1\text{)}, \\
s + 1,\ -s + 1 & \text{(zeros of } X^2 + X - 1\text{)}, \\
s - 1,\ -s - 1 & \text{(zeros of } X^2 - X - 1\text{)}.
\end{array}
\]

The addition table is easy to write down provided that you remember that
$1 + 1 = -1$ and $s + s = -s$ (since $3 = 0$ in a ring of characteristic three).
The multiplication table is also straightforward, using $s^2 + 1 = 0$.
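The grouping of the nine elements by the quadratic they satisfy can be verified directly. In the following sketch (an illustration under stated assumptions), the pair `(a, b)` represents $a + bs$ with $a, b \in \mathbb{Z}_3 = \{0, 1, 2\}$, so that $-1$ appears as `2`, and multiplication uses $s^2 = -1$; the function and variable names are hypothetical.

```python
P = 3

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    # (a + b*s)(c + d*s) = (ac - bd) + (ad + bc)*s, since s^2 = -1.
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def evaluate(coeffs, x):
    """Evaluate c0 + c1*X + c2*X^2 + ... at x by Horner's rule."""
    result = (0, 0)
    for c in reversed(coeffs):
        result = add(mul(result, x), (c % P, 0))
    return result

elements = [(a, b) for a in range(P) for b in range(P)]
quadratics = {
    "X^2 + 1": [1, 0, 1],
    "X^2 + X - 1": [-1, 1, 1],
    "X^2 - X - 1": [-1, -1, 1],
}
zeros = {name: [x for x in elements if evaluate(coeffs, x) == (0, 0)]
         for name, coeffs in quadratics.items()}
for name, xs in zeros.items():
    print(name, xs)
```

For example, `(1, 2)` stands for $1 + 2s = -s + 1$, which the code reports as a zero of $X^2 + X - 1$.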

#13 $GF(27)$

There are eight monic irreducible polynomials of degree 3 over $GF(3)$, and
$GF(27)$ contains three zeros for each of them (making 24 elements) along
with the three elements of the subfield $GF(3)$. (Note that $GF(9)$ is not a
subfield of $GF(27)$; in general, $GF(q_1)$ is a subfield of $GF(q_2)$ if and only if
$q_2$ is a power of $q_1$.)
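The count of eight monic irreducible cubics can be confirmed by exhaustive search, because a cubic over a field is reducible exactly when it has a zero in that field. The short sketch below assumes $\mathbb{Z}_3 = \{0, 1, 2\}$ and stores $c_0 + c_1X + c_2X^2 + X^3$ as the tuple `(c0, c1, c2)`; the names are illustrative.

```python
from itertools import product

P = 3

def has_root(coeffs):
    """True if c0 + c1*X + c2*X^2 + X^3 has a zero in Z_3."""
    c0, c1, c2 = coeffs
    return any((c0 + c1 * x + c2 * x * x + x ** 3) % P == 0 for x in range(P))

# A cubic is reducible exactly when it has a linear factor, that is, a zero.
irreducible_cubics = [c for c in product(range(P), repeat=3) if not has_root(c)]
print(len(irreducible_cubics))
print(irreducible_cubics)
```

Three zeros for each irreducible cubic, together with the three elements of $GF(3)$, account for all 27 elements.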


Exercises 

1. Suppose that $R$ is a ring of characteristic three with identity element 1.

(i) Is the set $\{0, 1, -1\}$ a subring of $R$?

(ii) Suppose that $t \in R$ and $t^2 + 1 = 0$. Show that there are at most
nine elements of $R$ given by polynomial expressions in $t$, and write
down nine such expressions giving these elements. Prove that these
elements form a subring $S$ of $R$. Is $S$ a field?

2. Prove that the ring $\mathbb{Q}[X]/(X^2 - 2)\mathbb{Q}[X]$ is isomorphic to the field of all
real numbers of the form $a + b\sqrt{2}$ with $a, b \in \mathbb{Q}$.

3. (i) Show that the equation of the line joining the points $(1 + \sqrt{2},\, 1)$
and $(\sqrt{2},\, \sqrt{1 + \sqrt{2}})$ has coefficients in $F$.

(ii) Calculate the coordinates of the point of intersection of the
lines

\[
\begin{aligned}
\sqrt{2}\,x - \sqrt{1 + \sqrt{2}}\;y &= 1 \\
(1 + \sqrt{2})\,x + y &= 1 + \sqrt{1 + \sqrt{2}}
\end{aligned}
\]

and show that they lie in $F$.

4. Find the smallest subfield $F$ of $\mathbb{R}$ which contains the coordinates of the
point of intersection of the circle with centre $(0, 0)$ and radius $\sqrt{3}$ and the
line joining $(1/2, 0)$ and $(4\sqrt{2}, \sqrt{2})$. Calculate also the degree $[F : \mathbb{Q}]$.

5. Let $J = (X^3 + X + 1)\mathbb{Z}_2[X]$. Express the following elements of $\mathbb{Z}_2[X]/J$
in the form $J + (aX^2 + bX + c)$ with $a, b, c \in \mathbb{Z}_2$:

(i) $J + X^5$. (ii) $J + (X^4 + X + 1)$.

6. Let $I$ be the principal ideal in $\mathbb{Q}[X]$ generated by $X^4 - 2X^2 + 1$. For
each of the following elements of $\mathbb{Q}[X]/I$ determine whether an inverse
exists, and, if one does, find it.

(i) $I + (X^2 + X + 1)$. (ii) $I + (X^2 + X - 2)$.

7. For the given field $F$ and polynomial $s(X) \in F[X]$, list all the elements
of the ring $Q = F[X]/s(X)F[X]$ and construct a multiplication table
for $Q$.

(i) $F = \mathbb{Z}_2$, $s(X) = X^2 + 1$

(ii) $F = \mathbb{Z}_3$, $s(X) = X^2 + 1$

(iii) $F = \mathbb{Z}_2$, $s(X) = X^3 + X + 1$

(iv) $F = \mathbb{Z}_3$, $s(X) = X^2 - X + 1$

8. Prove that $\sqrt{5} \notin \mathbb{Q}[\sqrt{2}]$ by attempting to solve $\sqrt{5} = x + y\sqrt{2}$ for rational
numbers $x$ and $y$.

9. (i) Prove that $\sqrt{5} \notin \mathbb{Q}[\sqrt[3]{2}]$ by attempting to solve

\[ \sqrt{5} = x + y\sqrt[3]{2} + z(\sqrt[3]{2})^2 \]

for $x, y, z \in \mathbb{Q}$.

(Hint: Use the fact that $1$, $\sqrt[3]{2}$ and $(\sqrt[3]{2})^2$ are linearly
independent over $\mathbb{Q}$.)

(ii) Prove that $\sqrt{5} \notin \mathbb{Q}[\sqrt[3]{2}]$ by considering degrees of field extensions.

(Hint: If $\sqrt{5} \in \mathbb{Q}[\sqrt[3]{2}]$ then $\mathbb{Q} \subseteq \mathbb{Q}[\sqrt{5}] \subseteq \mathbb{Q}[\sqrt[3]{2}]$.)

(iii) Use considerations of degree to prove that y/vf / Q[\/43]. (Note
that a computational proof of this fact would be rather messy!)

10. Let $E = \mathbb{Q}[\sqrt{2}]$, $K = E[\sqrt{5}]$.

(i) Calculate the degree $[K : \mathbb{Q}]$.

(ii) Observe that $t = \sqrt{2} + \sqrt{5} \in K$. Can the numbers $1, t, t^2, t^3, t^4$
be linearly independent over $\mathbb{Q}$?

(iii) Prove that $\sqrt{2} + \sqrt{5}$ is algebraic over $\mathbb{Q}$ and find a polynomial
$f(X) \in \mathbb{Q}[X]$ such that $f(\sqrt{2} + \sqrt{5}) = 0$.

11. Let $F$ be a finite field and $S$ a subfield of $F$. Prove that the number of
elements of $F$ is a power of the number of elements of $S$.

(Hint: Imitate the proof of Theorem 8.21.)

12. Let $F$ be a field of characteristic $p$.

(i) Prove that $(a + b)^p = a^p + b^p$ for all $a, b \in F$.

(ii) Prove that the function $\phi\colon F \to F$ defined by $\phi(a) = a^p$ (for all
$a \in F$) is a ring homomorphism.

(iii) Prove that $x = 1$ is the only solution in $F$ of the equation $x^p = 1$.

13. Let $F$ be a field of characteristic $p$ such that the polynomial $X^{p^3} - X$
has $p^3$ distinct roots in $F$. Prove that these roots form a subfield of $F$
with $p^3$ elements. (Hint: Use Theorem 5.3.)

14. Let $F$ be a field and let $f(X) \in F[X]$ be an irreducible polynomial. Let
$Q = F[X]/I$ where $I = f(X)F[X]$ and let $a = I + X \in Q$. Thus
$Q$ is an extension field of $F$ and $a$ is a zero of $f(Y)$ in $Q[Y]$.

(i) Prove that $f(Y) = (Y - a)g(Y)$ for some $g(Y) \in Q[Y]$.

(ii) Suppose that $\deg(g) \ge 1$ and let $h(Y)$ be an irreducible factor of
$g(Y)$ in $Q[Y]$. Prove that there exists an extension field $E$ of $Q$ in
which $h$ has a zero.

(iii) Prove the field $E$ in the previous part is an extension of $F$ in which
the polynomial $f$ has at least two zeros.

(iv) Explain why there must exist an extension of $F$ in which $f$ has
$\deg(f)$ zeros. Show that this is also true if $f$ is not irreducible.

(Hint: Consider the irreducible factors of $f$ separately.)

15. Use the previous two exercises to prove the existence of a field with $p^3$
elements, and generalize the argument to prove the existence of a field
with $p^k$ elements for any $k$.